
DESCRIPTION 

Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics 

BACKGROUND OF THE INVENTION

The present invention relates to the field of antibacterial agents and the 
treatment of infections of animals or other complex organisms by bacteria. 

The frequency and spectrum of antibiotic-resistant infections have, in recent
years, increased in both the hospital and community. Certain infections have become
essentially untreatable and are growing to epidemic proportions in the developing
world as well as in institutional settings in the developed world. The staggering
spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial
genetic characteristics, widespread use of antibiotic drugs, and changes in society that
enhance the transmission of drug-resistant organisms. This spread of drug-resistant
microbes is leading to ever-increasing morbidity, mortality and health-care costs.

Ironically, it is the very success of antibiotics, resulting in their widespread
use, that has contributed the most to rising numbers of drug-resistant bacterial strains.
The longer a bacterial strain is exposed to a drug, the more likely it is to acquire
resistance. Today, a total of 160 antibiotics, all based on a few basic chemical
structures and targeting a small number of metabolic pathways, have found their way
to market. Over-prescription of these drugs, as well as the failure of patients to
comply with the complete antibiotic regimen, has led to the rapid emergence of
antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in
virtually all commercial production of beef and fowl, and changing societal
conditions, such as the growth of day-care centers, increased long-term care in
hospitals, and increased mobility of the population, have provided an environment
where drug-resistant microbes can emerge and spread. Thus, virtually all common
infectious bacteria are becoming, or have already become, resistant to one or more
groups of antibiotics. Such resistance now reaches all classes of antibiotics currently
in use, including: β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides,
chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and
mupirocin.

Over the last 45 years bacteria have adapted genetically to avoid the
destruction/alteration of the essential pathways that these chemotherapeutic agents



WO 00/32825 



PCT/IB99/02040 



target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the
rate at which new antibiotics are being developed. The consequence of this dilemma
has been a dramatic increase in the cost of treating infections that would otherwise
easily succumb to routine antibiotic therapy. Furthermore, and perhaps most
importantly, the emergence of multiple drug-resistant pathogenic bacteria has led to a
significant increase in morbidity and mortality, particularly in institutional settings.

Most major pharmaceutical companies have on-going drug discovery
programs for novel anti-microbials. These are based on screens for small molecule
inhibitors (natural products, bacterial culture media, libraries of small molecules,
combinatorial chemistry) of crucial metabolic pathways of the micro-organism of
interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for
cytotoxic compounds and in most cases is not based on a known mechanism of action
of the compounds. Pharmaceutical companies have large programs in this area.
Classical drug screening programs are being exhausted and many of these
pharmaceutical companies are looking towards rational drug design programs.

Several small to mid-size biotechnology companies as well as large
pharmaceutical companies have developed systematic high-throughput sequencing
programs to decipher the genetic code of specific micro-organisms of interest. The
goal is to identify, through sequencing, biochemical pathways or intermediates
that are unique to the microorganism. Knowledge of this may, in turn, form the
rationale for a drug discovery program based on the mechanism of action of the
identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic
Research, Human Genome Sciences Inc., and other companies have such sequencing
programs in place. However, one of the most critical steps in this approach is the
ascertainment that the identified proteins and biochemical pathways 1) are
non-redundant and essential for bacterial survival, and 2) constitute suitable and
accessible targets for drug discovery.

SUMMARY OF THE INVENTION

While animals such as humans are, on occasion, infected by pathogenic
bacteria, bacteria also have natural enemies. A number of host-specific viruses,
known as bacteriophages or phages, infect and kill bacteria in the natural
environment. Such bacteriophages generally have small compact genomes and
bacteria are their exclusive hosts. Many known bacteria are host to a large number of
bacteriophages that have been described in the literature. During the 1940s to 1960s,
phage biology was an area of active research. As a testimony to this, the study of
phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli)
contributed much to the early understanding of molecular biology and virology.

As is generally understood, bacteriophages (or phages) are viruses that infect
and kill bacteria. They are natural enemies of bacteria and, over the course of
evolution, have developed proteins (products of DNA sequences) which enable them
to infect a host bacterium, replicate their genetic material, usurp host metabolism, and
ultimately kill their host. The scientific literature well documents the fact that many
known bacteria have a large number of such bacteriophages (Ackermann and DuBow,
1987) that can infect and kill them (for example, see the ATCC bacteriophage
collection at http://www.atcc.org).

This invention utilizes the observation that bacteriophages successfully infect
and inhibit or kill host bacteria, targeting a variety of normal host metabolic and
physiological traits, some of which are shared by all bacteria, pathogenic and
nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to
or implication in disease or a morbid state of an infected organism. The invention
thus involves identifying and elucidating the molecular mechanisms by which phages
interfere with host bacterial metabolism, an objective being to provide novel targets
for drug design. Whether the phage blocks bacterial RNA transcription or translation,
or attacks other important metabolic pathways, such as cell wall assembly or
membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is
encoded in its genome and can be unlocked using bioinformatics, functional
genomics, and proteomics. By these means, the invention utilizes sequence
information from the genomics of bacteriophage to identify novel antimicrobials that
can be further used to actively and/or prophylactically treat bacterial infection.

Two important components of the invention thus are: i) the identification of
bacteria-inhibiting phage open reading frames ("ORFs") and corresponding products
that can be used to develop antibiotics based on amino acid sequence and secondary
structural characteristics of the ORF products, and ii) the use of bacteriophages to map
out essential bacterial target genes and homologs, which can in turn lead to the
development of suitable anti-microbial agents. These two avenues represent new and
general methods for developing novel antimicrobials.

The invention thus concerns the identification of bacteriophage ORFs that
supply bacteria-inhibiting functions. In this regard, the terms "inhibit",
"inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a
biological activity or function. Such reduction in activity or function can, for
example, be in connection with a cellular component, e.g., an enzyme, or in
connection with a cellular process, e.g., synthesis of a particular protein, or in
connection with an overall process of a cell, e.g., cell growth. In reference to bacterial
cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be
bacteriocidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least
slowing bacterial cell growth). The latter slows or prevents cell growth such that
fewer cells of the strain are produced relative to uninhibited cells over a given period
of time. From a molecular standpoint, such inhibition may equate with a reduction in
the level of, or elimination of, the transcription and/or translation of a specific
bacterial target(s), or reduction or elimination of activity of a particular target
biomolecule.

It is particularly advantageous to evaluate a plurality of different phage ORFs
for inhibitory activity; the ORFs may be from one phage, but are preferably from a
plurality of different phage. For example, evaluating ORFs from a number of different
phage of the same bacterial host provides at least two advantages. First, the multiple
phages will provide identification of a variety of different targets. Second, it is likely
that multiple phage will utilize the same cellular target.

As used herein, the terms "bacteriophage" and "phage" are used
interchangeably to refer to a virus which can infect a bacterial strain or a number of
different bacterial strains.

In the context of this invention, the term "bacteriophage ORF" or "phage
ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In
connection with a particular ORF, the terms refer to an open reading frame which has at
least 95% sequence identity, preferably at least 97% sequence identity, more
preferably at least 98% sequence identity with an ORF from the particular phage
identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence
which has the specified sequence identity percentage with such an ORF sequence.

A first aspect of the invention thus provides a method for identifying a
bacteriophage nucleic acid coding region encoding a product active on an essential
bacterial target by identifying a nucleic acid sequence encoding a gene product which
provides a bacteria-inhibiting function when the bacteriophage infects a host
bacterium, preferably one that is an animal or plant pathogen, more preferably a bird
or mammalian pathogen, and most preferably a human pathogen. The bacteriophage
is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage
λ, φX174, M13 and other E. coli-specific bacteriophage that have been studied with
respect to gene number and/or function. It also excludes, for example, the nucleic
acid coding regions described in Tables 12-14, and in preferred embodiments,
excludes the phage in which those regions are naturally located.

In connection with bacteriophage, the term "uncharacterized" means that a
certain bacteriophage's genome has not yet been fully identified such that the genes
having function involved in inhibiting host cells have not been identified. In
particular, phage for which the description of genomic or protein sequence was first
provided herein are uncharacterized. Phage sequences for which host
bacteria-inhibiting functions have been identified prior to the filing of the present
application (or alternatively prior to the present invention) are specifically excluded
from the aspects involving utilization of sequences from uncharacterized bacteriophage,
except that aspects may involve a plurality of phage where one or more of those phage are
uncharacterized and one or more others have been characterized to some extent. A
number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term
"uncharacterized"; alternatively, in preferred embodiments the phage containing those
ORFs are excluded from this term. Further, any additional phage ORFs (or
alternatively the phage which contain those ORFs) which have previously been
described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or
phage are known to those skilled in the art and the exclusion can be made express by
specifically naming such ORFs or phage as needed (likewise for uncharacterized
targets as described below). For the sake of brevity, such a listing is not expressly
presented, as such information is readily available to those skilled in the art.

Stating that an agent or compound is "active on" a particular cellular target,
such as the product of a particular gene, means that the target is an important part of a
cellular pathway which includes that target and that the agent acts on that pathway.
Thus, in some cases the agent may act on a component upstream or downstream of the
stated target, including on a regulator of that pathway or a component of that pathway.
By "essential", in connection with a gene or gene product, is meant that the host
cannot survive without, or is significantly growth compromised in the absence,
depletion, or alteration of, functional product. An "essential gene" is thus one that
encodes a product that is beneficial, or preferably necessary, for cellular growth in
vitro in a medium appropriate for growth of a strain having a wild-type allele
corresponding to the particular gene in question. Therefore, if an essential gene is
inactivated or inhibited, that cell will grow significantly more slowly, preferably at less
than 20%, more preferably less than 10%, most preferably less than 5% of the growth
rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in
the absence of activity provided by a product of the gene, the cell will not grow at all
or will be non-viable, at least under culture conditions similar to the in vivo conditions
normally encountered by the bacterial cell during an infection. For example, absence
of the biological activity of certain enzymes involved in bacterial cell wall synthesis
can result in the lysis of cells under normal osmotic conditions, even though
protoplasts can be maintained under controlled osmotic conditions. In the context of
the invention, essential genes are generally the preferred targets of antimicrobial
agents. Essential genes can encode target molecules directly or can encode a product
involved in the production, modification, or maintenance of a target molecule.
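
The growth-rate thresholds in the preceding paragraph can be read as a simple classification. The following Python sketch is illustrative only: the function name and the returned labels are hypothetical, and it assumes that both growth rates were measured in the same medium under the same conditions.

```python
def essentiality_tier(inhibited_rate: float, wild_type_rate: float) -> str:
    """Classify relative growth against the 20% / 10% / 5% thresholds in the text.

    Both rates are assumed to be in the same units (e.g., doublings per hour).
    """
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    if inhibited_rate < 0:
        raise ValueError("growth rate cannot be negative")

    fraction = inhibited_rate / wild_type_rate
    if fraction == 0:
        return "no growth"
    if fraction < 0.05:
        return "most preferred (< 5% of wild-type)"
    if fraction < 0.10:
        return "more preferred (< 10% of wild-type)"
    if fraction < 0.20:
        return "preferred (< 20% of wild-type)"
    # Slower growth may still be "significant"; it just misses the preferred tiers.
    return "above the preferred thresholds"
```

For example, a strain growing at 0.08 doublings per hour against a wild-type rate of 1.0 falls in the "more preferred" tier.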

A "target" refers to a biomolecule that can be acted on by an exogenous agent,
thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases
such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein.
However, other types of biomolecules can also be targets, e.g., membrane lipids and
cell wall structural components.

The term "bacterium" refers to a single bacterial strain, and includes a single
cell, and a plurality or population of cells of that strain unless clearly indicated to the
contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria
or phage having a particular genetic content. The genetic content includes genomic
content as well as recombinant vectors. Thus, for example, two otherwise identical
bacterial cells would represent different strains if each contained a vector, e.g., a
plasmid, with different phage ORF inserts.

In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A,
96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage
Dp-1.

In preferred embodiments, the phage is selected from. Preferred embodiments
involve expressing at least one recombinant phage ORF in a bacterial host followed
by inhibition analysis of that host. Inhibition following expression of the phage ORF
is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a
support matrix such as a solidified medium in a petri dish, or in liquid culture.
Preferably a plurality of phage ORFs are expressed in at least one bacterium. The
plurality of phage ORFs can be from one or a plurality of phage. With respect to a
single phage or at least one phage in a plurality of phages, the plurality of expressed
ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or 60%,
still more preferably at least 80% or 90%, and most preferably at least 95% of the
ORFs in the phage genome. For a plurality of phage, the plurality of
expressed ORFs preferably represents at least 10%, more preferably at least 20%,
40%, or 60%, still more preferably at least 80% or 90%, and most preferably at least
95% of the ORFs in the phage genome of each phage. The plurality of phage ORFs
can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is
expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are
expressed in at least one or in all of the plurality of bacteria, or combinations of these.
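
The coverage percentages above amount to a ratio of expressed ORFs to all ORFs in a phage genome. The sketch below shows one way to compute that ratio and report the highest stated tier it meets. The function names and the ORF identifiers used in the usage note are hypothetical.

```python
from typing import Optional, Set

# Percentages named in the text, highest first.
COVERAGE_TIERS = [95, 90, 80, 60, 40, 20, 10]


def orf_coverage(expressed: Set[str], genome_orfs: Set[str]) -> float:
    """Percent of a phage genome's ORFs that were expressed in the screen."""
    if not genome_orfs:
        raise ValueError("genome ORF set is empty")
    # Ignore any expressed identifiers that are not ORFs of this genome.
    return 100.0 * len(expressed & genome_orfs) / len(genome_orfs)


def highest_tier_met(coverage: float) -> Optional[int]:
    """Return the highest tier (in percent) reached, or None if below 10%."""
    for tier in COVERAGE_TIERS:
        if coverage >= tier:
            return tier
    return None
```

With 21 of 50 ORFs expressed, `orf_coverage` gives 42.0 and `highest_tier_met` returns 40. For a plurality of phage, the same calculation would be repeated for each phage genome.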

In embodiments of the above aspect (as well as in other aspects herein) in
which a plurality of phage are utilized, the plurality of phage have the same bacterial
host species; have different bacterial host species; or both. The plurality of phage
includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more
different phage. Indeed, more preferably, the plurality of phage will include 50, 75,
100, or more phage. As described herein, the larger number of phage is useful to
provide additional target and target evaluation information useful in developing
antibacterial agents, for example, by providing identification of a larger range of
bacterial targets, and/or providing further indication of the suitability of a particular
target (for example, utilization of a target by a number of different unrelated phage
can suggest that the target is particularly stable and accessible and effective) and/or
can indicate alternate sites on a target which interact with different inhibitors.

Further embodiments involve confirmation of the inhibitor function of the
phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the
inhibitory nature of the ORF(s) being evaluated. The control can, for example, be
provided by expression of an inactive or partially inactive form of the ORF or ORF
product, and/or by the absence of expression of the ORF or ORF product in the same
or a closely comparable bacterial strain as that used for expression of the test ORF.
The reduced level of activity or the absence of active ORF product in the control will
thus not provide the inhibition provided by a corresponding inhibitory ORF, or will
provide a distinguishably lower level of inhibition. An inactivated or partially
inactivated control has a mutation(s), e.g., in the coding region or in flanking
regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF.
Thus, the inhibition of a bacterium following expression of a phage ORF is
determined by comparison with the effects of expression of an inactivated ORF or the
response of the bacteria in the absence of expression in the same or similar type
bacterium. Such determination of inhibition of the bacterium following expression of
the ORF is indicative of a bacteria-inhibiting function. These manipulations are
routinely understood and accomplished by those of skill in the art using standard
techniques. In embodiments utilizing absence of expression of the ORF, the bacteria
can, for example, contain an empty vector or a vector which allows expression of an
unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria
may have no vector at all. Combinations of such controls or other controls may also
be utilized as recognized by those skilled in the art.
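
The comparison with controls described above can be sketched numerically. In the Python example below, growth is represented by an optical density reading, and an ORF is called inhibitory only if the test culture grows to less than a chosen fraction of every available control. The function names, the use of optical density, and the 0.5 cutoff are all assumptions made for illustration; the text does not specify a readout or a numerical cutoff.

```python
from typing import Optional


def relative_growth(test_od: float, control_od: float) -> float:
    """Growth of the test culture as a fraction of a control culture."""
    if control_od <= 0:
        raise ValueError("control optical density must be positive")
    return test_od / control_od


def is_inhibitory(test_od: float,
                  empty_vector_od: float,
                  inactive_orf_od: Optional[float] = None,
                  cutoff: float = 0.5) -> bool:
    """True if the induced test ORF culture falls below `cutoff` of each control.

    `cutoff` is a hypothetical threshold, not a value taken from the text.
    """
    controls = [empty_vector_od]
    if inactive_orf_od is not None:
        # Optional second control: an inactivated form of the same ORF.
        controls.append(inactive_orf_od)
    return all(relative_growth(test_od, c) < cutoff for c in controls)
```

Requiring the test culture to fall below every control, rather than any one of them, is a conservative design choice: it avoids calling an ORF inhibitory when only one control happened to grow unusually well.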

In preferred embodiments involving expression of a phage ORF in a bacterial
strain, that expression is inducible.

By "inducible" is meant that expression is absent or occurs at a low level until
the occurrence of an appropriate environmental stimulus provides otherwise. For the
present invention such induction is preferably controlled by an artificial
environmental change, such as by contacting a bacterial strain population with an
inducing compound (i.e., an inducer). However, induction could also occur, for
example, in response to build-up of a compound produced by the bacteria in the
bacterial culture, in the medium. As uncontrolled or constitutive expression of
inhibitory ORFs can severely compromise bacteria to the point of eradication, such
expression is therefore undesirable in many cases because it would prevent effective
evaluation of the strain and inhibitor being studied. For example, such uncontrolled
expression could prevent any growth of the strain following insertion of a
recombinant ORF, thus preventing determination of effective transfection or
transformation. A controlled or inducible expression is therefore advantageous and is
generally provided through the provision of suitable regulatory elements, e.g.,
promoter/operator sequences that can be conveniently transcriptionally linked to a
coding sequence to be evaluated. In most cases, the vector will also contain
sequences suitable for efficient replication of the vector in the same or different host
cells and/or sequences allowing selection of cells containing the vector, i.e.,
"selectable markers." Further, preferred vectors include convenient primer sequences
flanking the cloning region from which PCR and/or sequencing may be performed.

As knowledge of the nucleotide sequence of phage ORFs is useful, e.g., for
assisting in the identification of phage proteins active against essential bacterial host
targets, preferred embodiments involve the sequencing of at least a portion of the
phage genome in combination with the above methods. This can be done either before
or after or independent of expression and inhibition of the ORF in the bacteria, and
provides information on the nature and characteristics of the ORF. Such a portion is
preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For
embodiments in which a plurality of phage are utilized, preferably each phage is
sequenced to an extent as just specified.

Such sequencing is preferably accompanied by computer sequence analysis to
define and evaluate ORF(s), ORF products, structural motifs or functional properties
of ORF products, and/or their genetic control elements. Thus, certain embodiments
incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information
which can be utilized for analysis and identification of ORFs in the sequence.
Computer analysis may further employ known homologous sequences from other
species that suggest or indicate conserved underlying biochemical function(s) for the
inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can
include the sequences of signature motifs of identified classes of inhibitors.
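
As an illustration of the kind of computer sequence analysis used to define ORFs, the following Python sketch scans both strands of a DNA sequence in all three reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a deliberately minimal example and not the analysis used in the present application: it assumes ATG as the only start codon (bacterial and phage genes can also use alternative starts such as GTG or TTG), and the function names and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, start, end, orf_sequence) for each ATG...stop ORF.

    start and end are 0-based, end-exclusive coordinates on the scanned
    strand; min_codons counts the stop codon as part of the ORF.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

A short call such as `find_orfs("CCATGAAATTTGGGTAACC", min_codons=5)` reports a single forward-strand ORF spanning positions 2 to 17. A real analysis would typically use a longer minimum length and dedicated gene-finding software.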

In the context of the phage nucleic acid sequences, e.g., gene sequences, of this
invention, the terms "homolog" and "homologous" denote nucleotide sequences from
different bacteria or phage strains or species or from other types of organisms that
have significantly related nucleotide sequences, and consequently significantly related
encoded gene products, preferably having related function. Homologous gene
sequences or coding sequences have at least 70% sequence identity (as defined by the
maximal base match in a computer-generated alignment of two or more nucleic acid
sequences) over at least one sequence window of 48 nucleotides, more preferably at
least 80 or 85%, still more preferably at least 90%, and most preferably at least 95%.
The polypeptide products of homologous genes have at least 35% amino acid
sequence identity over at least one sequence window of 18 amino acid residues, more
preferably at least 40%, still more preferably at least 50% or 60%, and most
preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is
also a functional homolog, meaning that the homolog will functionally complement
one or more biological activities of the product being compared. For nucleotide or
amino acid sequence comparisons where a homology is defined by a % sequence
identity, the percentage is determined using BLAST programs (with default
parameters) (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can
also be used, preferably using default parameters. Performance characteristics for
three different algorithms in homology searching are described in Salamov et al., 1999,
"Combining sensitive database searches with multiple intermediates to detect distant
homologies." Protein Eng. 12:95-100. Another exemplary program package is the
GCG™ package from the University of Wisconsin.
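
The window-based identity criteria above (70% over a 48-nucleotide window; 35% over an 18-residue window) can be illustrated with a sliding-window calculation. The Python sketch below is a simplified stand-in and not a substitute for BLAST: it assumes the two sequences have already been aligned to equal length (with "-" as the gap character), counts gap columns as mismatches, and uses hypothetical function names.

```python
def best_window_identity(aligned_a: str, aligned_b: str, window: int) -> float:
    """Highest percent identity found in any window of the given length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0:
        raise ValueError("window must be positive")
    n = len(aligned_a)
    if n < window:
        return 0.0  # no complete window exists

    # 1 where the column is an identical, non-gap pair; 0 otherwise.
    matches = [1 if a == b and a != "-" else 0
               for a, b in zip(aligned_a.upper(), aligned_b.upper())]

    current = sum(matches[:window])
    best = current
    for i in range(window, n):
        # Slide the window one column to the right.
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def is_nucleotide_homolog(aligned_a: str, aligned_b: str) -> bool:
    """Minimum nucleotide criterion from the text: >= 70% over 48 nt."""
    return best_window_identity(aligned_a, aligned_b, 48) >= 70.0


def is_protein_homolog(aligned_a: str, aligned_b: str) -> bool:
    """Minimum polypeptide criterion from the text: >= 35% over 18 residues."""
    return best_window_identity(aligned_a, aligned_b, 18) >= 35.0
```

Because only one window needs to pass, two long sequences with a single well-conserved stretch would satisfy the minimum criterion even if their overall identity were low.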

Homologs may also or in addition be characterized by the ability of two
complementary nucleic acid strands to hybridize to each other under appropriately
stringent conditions. Hybridizations are typically and preferably conducted with
probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those
skilled in the art understand how to estimate and adjust the stringency of hybridization
conditions such that sequences having at least a desired level of complementarity will
stably hybridize, while those having lower complementarity will not. For examples of
hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989)
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold
Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology.
John Wiley & Sons, Secaucus, N.J. Homologs and homologous gene sequences may
thus be identified using any nucleic acid sequence of interest, including the phage
ORFs and bacterial target genes of the present invention.

A typical hybridization, for example, utilizes, besides the labeled probe of
interest, a salt solution such as 6X SSC (NaCl and sodium citrate base) to stabilize
nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with
other typical additives such as Denhardt's solution and salmon sperm DNA. The
solution is added to the immobilized sequence to be probed and incubated at suitable
temperatures to preferably permit specific binding while minimizing nonspecific
binding. The temperature of the incubations and ensuing washes is critical to the
success and clarity of the hybridization. Stringent conditions employ relatively higher
temperatures, lower salt concentrations, and/or more detergent than do non-stringent
conditions. Hybridization temperatures also depend on the length, complementarity
level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent
hybridizations and washes are conducted at temperatures of at least 40°C, while lower
stringency hybridizations and washes are typically conducted at 37°C down to room
temperature (~25°C). One of skill in the art is aware that these conditions may vary
according to the parameters indicated above, and that certain additives such as
formamide and dextran sulphate may also be added to affect the conditions.

By "stringent hybridization conditions" is meant hybridization conditions at
least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM
NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X
Denhardt's solution at 42°C overnight; washing with 2X SSC, 0.1% SDS at 45°C; and
washing with 0.2X SSC, 0.1% SDS at 45°C.
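
The salt concentrations implied by the SSC strengths in the conditions above can be worked out directly. The Python sketch below assumes the conventional composition of SSC, in which a 1X solution is 0.15 M NaCl and 0.015 M trisodium citrate (a 20X stock being 3 M NaCl and 0.3 M citrate); that composition is standard laboratory practice rather than something stated in the text, and the names used are hypothetical.

```python
# Assumed conventional composition of 1X SSC (molar concentrations).
SSC_1X = {"NaCl_M": 0.150, "trisodium_citrate_M": 0.015}

# SSC strengths taken from the stringent conditions defined in the text.
STEPS = [("hybridization", 5.0), ("first wash", 2.0), ("second wash", 0.2)]


def ssc_composition(strength_x: float) -> dict:
    """Molar concentrations of each SSC component at a given X strength."""
    if strength_x < 0:
        raise ValueError("SSC strength cannot be negative")
    return {name: round(strength_x * molar, 4) for name, molar in SSC_1X.items()}


def nacl_by_step():
    """NaCl molarity at each step, in the order the steps are performed."""
    return [(name, ssc_composition(x)["NaCl_M"]) for name, x in STEPS]
```

Under these assumptions the NaCl concentration falls from 0.75 M during hybridization to 0.3 M and then 0.03 M in the washes, which is consistent with the statement that stringent conditions employ lower salt concentrations.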

In sequence comparison analyses, an ORF, or motif, or set of motifs in a
bacteriophage sequence can be compared to known inhibitor sequences, e.g.,
homologous sequences encoding homologous inhibitors of bacterial function.
Likewise, the analysis can include comparison with the structure of essential bacterial
gene products, as structural similarities can be indicative of similar or replacement
biological function. Such analysis can include the identification of a signature, or
characteristic motif(s) of an inhibitor or inhibitor class.

Also, the identification of structural motifs in an encoded product, based on
nucleotide or amino acid sequence analysis, can be used to infer a biochemical
function for the product. A database containing identified structural motifs in a large
number of sequences is available for identification of motifs in phage sequences. The
database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The
identification of motifs can, for example, include the identification of signature motifs
for a class or classes of inhibitory proteins. Other such databases may also be used.
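
Motif identification of the kind described above can be illustrated by converting a pattern written in PROSITE syntax into a regular expression and scanning a protein sequence with it. The Python sketch below handles only the common syntax elements (single residues, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` repeats, and `<` / `>` terminal anchors). The function names, the example pattern, and the example sequence are illustrative assumptions, and the sketch is not a replacement for the PROSITE scanning tool itself.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    body = pattern.strip().rstrip(".")  # patterns conventionally end with "."
    pieces = []
    for element in body.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]

        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError("empty pattern element")
        core, low, high = match.groups()

        if core == "x":
            core_re = "."                          # any residue
        elif core.startswith("[") and core.endswith("]"):
            core_re = core                         # allowed residues
        elif core.startswith("{") and core.endswith("}"):
            core_re = "[^" + core[1:-1] + "]"      # excluded residues
        elif len(core) == 1 and core.isalpha():
            core_re = core                         # a single fixed residue
        else:
            raise ValueError("unsupported pattern element: " + core)

        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        pieces.append(prefix + core_re + repeat + suffix)
    return "".join(pieces)


def scan(sequence: str, pattern: str):
    """Return (0-based position, matched text) for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]
```

For instance, the pattern `N-{P}-[ST]-{P}.` becomes the regular expression `N[^P][ST][^P]`, and scanning the made-up sequence `MKNGSAAANPSA` reports one hit, `NGSA`, at position 2; the second asparagine is followed by proline and is therefore skipped.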

In aspects and preferred embodiments described herein, in which a bacterium
or host bacterium is specified, the bacterium or host bacterium is preferably selected
from a pathogenic bacterial species, for example, one selected from Table 1.
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium
is a bird or mammalian pathogen, still more preferably a human pathogen.

In aspects and preferred embodiments involving a bacteriophage or sequences
from a bacteriophage, one or more bacteriophage are preferably selected from those
listed in Table 1. Those exemplary bacteriophage are readily obtained from the
indicated sources.

In some cases, it is advantageous to utilize phage with non-pathogenic host
bacteria. The genome, structural motif, ORF, homolog, and other analyses described
herein can be performed on such phage and bacteria. Such analysis provides useful
information and compositions. The results of such analyses can also be utilized in
aspects of the present invention to identify homologous ORFs, especially inhibitor
ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in
a non-pathogenic host can be used to identify homologous sequences and targets in
pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in
the art are familiar with bacterial genetic relationships and with how to determine
relatedness based on levels of genomic identity or other measures of nucleotide
sequence and/or amino acid sequence similarity, and/or other physical and culture
characteristics such as morphology, nutritional requirements, or minimal media to
support growth.

Also in preferred embodiments, an embodiment of this aspect is combined
with an embodiment of the following aspect.

A related aspect of the invention provides methods for identifying a target for
antibacterial agents by identifying the bacterial target(s) of at least one
uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such
identification allows the development of antibacterial agents active on such targets.
Preferred embodiments for identifying such targets involve the identification of
binding of target and phage ORF products to one another. The phage ORF products
may be subportions of a larger ORF product that also binds the host target. In
preferred embodiments, the phage protein or RNA is from an uncharacterized
bacteriophage in Table 1. This aspect preferably includes the identification of a
plurality of such targets in one or a plurality of different bacteria, preferably in one or
a plurality of bacteria listed in Table 1.

In preferred embodiments of this aspect and other aspects of this invention
involving particular phage ORFs or phage sequences, the ORF is Staphylococcus
aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application
09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated for the above aspect, preferably the method involves the use of a
plurality of different phage, and thus a plurality of different phage inhibitors and/or
inhibitor ORFs.

In addition to uncharacterized phage ORF products, it is also useful to identify
the targets of phage ORF products which are known to be inhibitors of host bacteria,
but where the target has not been identified. Thus, such inhibitors can likewise be
utilized as "untargeted" inhibitor phage ORFs and ORF products, e.g., proteins or
RNAs.

In the context of inhibitor proteins or RNAs from a phage, the term
"uncharacterized" means that a bacteria-inhibiting function for the protein has not
previously been identified. Preferably, but not necessarily, the sequence of the protein
or the corresponding coding region or ORF was not described in the art before the
filing of the present application for patent (or alternatively prior to the present
invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein
and its associated bacterial target which has been identified as inhibitory before the
present invention or alternatively before the filing of the present application, for
example those identified in Tables 12-14 or otherwise identified herein. For example,
from E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, and phage T4
gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product
also targets the host translation apparatus. As with the uncharacterized bacteriophage
ORFs or bacteriophage above, for such identified proteins, the sequences encoding
those proteins are excluded from the uncharacterized inhibitor proteins.

The term "fragment" refers to a portion of a larger molecule or assembly. For
proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous
amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12,
15, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or
polynucleotides, the term "fragment" refers to a molecule which includes at least 15
contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36,
45, 60, 90, 150, or more contiguous nucleotides.
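
The minimum-length criteria in the definition of "fragment" can be sketched as a simple check for a shared contiguous run. The constant and function names below are hypothetical and chosen only for illustration; the thresholds of 5 amino acids and 15 nucleotides are the minimum values stated above.

```python
MIN_PROTEIN_RUN = 5       # contiguous amino acids, per the definition above
MIN_NUCLEOTIDE_RUN = 15   # contiguous nucleotides, per the definition above


def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if candidate contains at least min_len contiguous
    residues that also occur contiguously in reference."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    # Collect every window of length min_len in the reference sequence.
    reference_windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # The candidate qualifies if any of its windows matches one of them.
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )
```

For example, with a made-up reference peptide "MKTAYIAKQR", a candidate containing "TAYIA" satisfies the 5-residue criterion, whereas one sharing only "TAYI" does not.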

Preferred embodiments involve identification of binding that includes methods
for distinguishing bound molecules, for example, affinity chromatography,
immunoprecipitation, crosslinking, and/or genetic screen methods that permit
protein:protein interactions to be monitored. One of skill in the art is familiar with
these techniques and common materials utilized (see, e.g., Coligan, J. et al. (eds.)
(1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.).

Genetic screening for the identification of protein:protein interactions typically
involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the
phage ORF to be tested) and a chimeric target nucleic acid sequence that, when
co-expressed and having affinity for one another in a host cell, stimulate reporter gene
expression to indicate the relationship. A "positive" can thus suggest a potential
inhibitory effect in bacteria. This is discussed in further detail in the Detailed
Description section below. In this way, new bacterial targets can be identified that are
inhibited by specific phage ORF products or derivatives, fragments, mimetics, or
other molecules.

Other embodiments involve the identification and/or utilization of mutant
targets by virtue of their host's relatively unresponsive nature in the presence of
expression of ORFs previously identified as inhibitory to the non-mutant or wild-type
strain. Such mutants have the effect of protecting the host from an inhibition that
would otherwise occur and indirectly allow identification of the precise responsible
target for follow-up studies and anti-microbial development. In certain embodiments,
rescue from inhibition occurs under conditions in which a bacterial target or mutant
target is highly expressed. This is performed, for example, through coupling of the
sequence with regulatory element promoters, e.g., as known in the art, which regulate
expression at levels higher than wild-type, e.g., at a level sufficiently higher that the
inhibitor can be competitively bound to the highly expressed target such that the
bacterium is detectably less inhibited.

Identification of the bacterial target can involve identification of a
phage-specific site of action. This can involve a newly identified target, or a target
where the phage site of action differs from the site of action of a previously known
antibacterial agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the
host RNA polymerase, which is also the cellular target for the antibacterial agent
rifampin. To the extent that a phage product is found to act at a different site than
previously described inhibitors, aspects of the present invention can utilize those new,
phage-specific sites for identification and use of new agents. The site of action can be
identified by techniques well-known to those skilled in the art, for example, by
mutational analysis, binding competition analysis, and/or other appropriate
techniques.

Once a bacterial host target protein or nucleic acid or mutant target sequence
has been identified and/or isolated, it too can be conveniently sequenced, sequence
analyzed (e.g., by computer), and the underlying gene(s), and corresponding translated
product(s) further characterized. Preferred embodiments include such analysis and
identification. Preferably such a target has not previously been identified as an
appropriate target for antibacterial action.

Certain embodiments include the identification of at least one inhibitory phage
ORF or ORF product, e.g., as described for the above aspect, and thus are a
combination of the two aspects.

Additionally, the invention provides methods for identifying targets for
antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus,
Enterococcus faecalis or other Enterococci, and Streptococcus pneumoniae) of a
bacteriophage inhibitory ORF product. Such homologs may be utilized in the various
aspects and embodiments described herein as described for the host Enterococcus sp.
for bacteriophage 182.

Other aspects of the invention provide isolated, purified, or enriched specific
phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for
phage selected from uncharacterized phage listed in Table 1, preferably from
bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1
(Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in
Table 1 for those bacteria. For example, such sequences do not include sequences
identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least 15
nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more
preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer
nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900
or more nucleotides. Such sequences can, for example, be amplification 
oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a 
portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded 
protein. In preferred embodiments, the nucleic acid sequence contains a sequence 
which is within a length range with a lower length as specified above, and an upper 
length limit which is no more than 50, 60, 70, 80, or 90% of the length of the 
corresponding full-length ORF. The upper length limit can also be expressed in terms 
of the number of base pairs of the ORF (coding region). In preferred embodiments, 
the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43, 
102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage
44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004,
008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 
002, 008, or 014. 

As it is recognized that alternate codons will encode the same amino acid for
most amino acids due to the degeneracy of the genetic code, the sequences of this
aspect include nucleic acid sequences utilizing such alternate codon usage for one or
more codons of a coding sequence. For example, all four nucleic acid sequences
GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an
amino acid there exists an average of three codons, a polypeptide of 100 amino acids
in length will, on average, be encoded by 3^100, or about 5 x 10^47, nucleic acid sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a 
phage as specified above) to form a second nucleic acid sequence encoding the same 
polypeptide as encoded by the first nucleic acid sequence using routine procedures 
and without undue experimentation. Thus, all possible nucleic acid sequences that 
encode the specified amino acid sequences are also fully described herein, as if all 
were written out in full, taking into account the codon usage, especially that preferred 
in the host bacterium. The alternate codon descriptions are available in common
textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger,
BIOCHEMISTRY 3rd ed., along with many others. Codon preference tables for
various types of organisms are available in the literature. Sequences with alternate 
codons at one or more sites can also be utilized in the computer-related aspects and 
embodiments herein. Because of the number of sequence variations involving 
alternate codon usage, for the sake of brevity, individual sequences are not separately 
listed herein. Instead the alternate sequences are described by reference to the natural 
sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, 40,
50, or more) of the degenerate codons with alternate codons from the alternate codon
table (Table 6), or a modified table applicable to a particular organism that has
differing codon usage, preferably with selection according to preferred codon usage 
for the normal host organism or a host organism in which a sequence is intended to be 
expressed. Those skilled in the art also understand how to alter the alternate codons to 
be used for expression in organisms where certain codons code differently than shown 
in the "universal" codon table. 
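
The counting argument above can be checked with a short sketch. The degeneracy values below are those of the standard ("universal") genetic code, which has 61 sense codons; the dictionary and function names are hypothetical and introduced only for this example, and organisms with a differing code would need an adjusted table.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code
# (one-letter amino acid codes; the three stop codons are excluded).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct coding sequences that encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# The averaged estimate used in the text: three codons per residue
# over a polypeptide of 100 amino acids.
average_case = 3 ** 100
```

Here `average_case` is a 48-digit integer, approximately 5.15 x 10^47, which agrees with the figure given in the text. For a specific polypeptide, `count_encoding_sequences` gives the exact count under the standard code rather than the averaged estimate.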

For amino acid sequences or polypeptides, sequences contain at least 5 peptide- 
linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40, amino 
acids having identical amino acid sequence as the same number of contiguous amino 
acid residues in a particular phage ORF product. In some cases longer sequences may 
be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in 
length. In preferred embodiments, the amino acid sequence contains a sequence which 
is within a length range with a lower length as specified above, and an upper length 
limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding 
full-length ORF product. The upper length limit can also be expressed in terms of the 
number of amino acid residues of the ORF product. In preferred embodiments, the 
amino acid sequence or polypeptide has bacteria-inhibiting function when expressed 
or otherwise present in a bacterial cell which is a host for the bacteriophage from 
which the sequence was derived. 

By "isolated" in reference to a nucleic acid is meant that a naturally occurring 
sequence has been removed from its normal cellular (e.g., chromosomal) environment 
or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus, 
the sequence may be in a cell-free solution or placed in a different cellular
environment. The term does not imply that the sequence is the only nucleotide chain 
present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide 
material naturally associated with it, and thus is distinguished from isolated 
chromosomes. 

The term "enriched" means that the specific DNA or RNA sequence 
constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present 
in the cells or solution of interest than in normal or diseased cells or in cells from 
which the sequence was originally taken. This could be caused by a person by 
preferential reduction in the amount of other DNA or RNA present, or by a 
preferential increase in the amount of the specific DNA or RNA sequence, or by a 
combination of the two. However, it should be noted that enriched does not imply
that there are no other DNA or RNA sequences present, just that the relative amount
of the sequence of interest has been significantly increased.

The term "significant" is used to indicate that the level of increase is useful to
the person making such an increase and is an increase relative to other nucleic acids of
at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term
also does not imply that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a
cloning vector such as pUC19. This term distinguishes from naturally occurring
events, such as viral infection, or tumor type growths, in which the level of one
mRNA may be naturally increased relative to other species of mRNA. That is, the
term is meant to cover only those situations in which a person has intervened to
elevate the proportion of the desired nucleic acid.
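
The notion of enrichment as a fold increase in the fraction of a specific sequence can be illustrated with a minimal sketch. The function names and the way amounts are supplied (any consistent unit, e.g., nanograms) are assumptions for this example; the 2-fold default is the lower bound stated above.

```python
def fold_enrichment(target_after: float, total_after: float,
                    target_before: float, total_before: float) -> float:
    """Ratio of the target's fraction of total nucleic acid after
    intervention to its fraction before intervention."""
    amounts = (target_after, total_after, target_before, total_before)
    if any(amount <= 0 for amount in amounts):
        raise ValueError("all amounts must be positive")
    # (target_after / total_after) / (target_before / total_before),
    # rearranged to use a single division.
    return (target_after * total_before) / (total_after * target_before)


def is_significant(fold: float, threshold: float = 2.0) -> bool:
    """True if the fold increase meets the stated lower bound (2-fold by default)."""
    return fold >= threshold
```

For instance, a sequence that makes up 1 part in 100 before intervention and 10 parts in 100 afterwards is enriched 10-fold, which would count as significant under either the 2-fold or the 5- to 10-fold criterion.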

It is also advantageous for some purposes that a nucleotide sequence be in
purified form. The term "purified" in reference to nucleic acid does not require
absolute purity (such as a homogeneous preparation). Instead, it represents an
indication that the sequence is relatively more pure than in the natural environment
(compared to the natural level, this level should be at least 2-5 fold greater, e.g., in
terms of mg/mL). Individual clones isolated from a cDNA library may be purified to
electrophoretic homogeneity. The claimed DNA molecules obtained from these
clones could be obtained directly from total DNA or from total RNA. The cDNA
clones are not naturally occurring, but rather are preferably obtained via manipulation
of a partially purified naturally occurring substance (messenger RNA). The
construction of a cDNA library from mRNA involves the creation of a synthetic
substance (cDNA) and pure individual cDNA clones can be isolated from the
synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the
process which includes the construction of a cDNA library from mRNA and isolation
of distinct cDNA clones yields an approximately 10^6-fold purification of the native
message. Thus, purification of at least one order of magnitude, preferably two or
three orders, and more preferably four or five orders of magnitude is expressly
contemplated.

The terms "isolated", "enriched", and "purified" with respect to nucleic acids,
above, may similarly be used to denote the relative purity and abundance of
polypeptides (multimers of amino acids joined one to another by α-carboxyl:α-amino
group (peptide) bonds). These, too, may be stored in, grown in, screened in, and
selected from libraries using biochemical techniques familiar in the art. Such
polypeptides may be natural, synthetic or chimeric and may be extracted using any of
a variety of methods, such as antibody immunoprecipitation, other "tagging"
techniques, conventional chromatography and/or electrophoretic methods. Some of
the above utilize the corresponding nucleic acid sequence.

As indicated above, aspects and embodiments of the invention are not limited
to entire genes and proteins. The invention also provides and utilizes fragments and
portions thereof, preferably those which are "active" in the inhibitory sense described
above. Such peptides or oligopeptides and oligo or polynucleotides have preferred
lengths as specified above for nucleic acid and amino acid sequences from phage;
corresponding recombinant constructs can be made to express the encoded same.
Also included are homologous sequences and fragments thereof.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by
well-known methods. Also, by having particular phage ORFs, e.g., the phage ORFs
identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof,
or oligonucleotides derived therefrom as described), other antimicrobial sequences
from other bacteriophage sources can be identified and isolated using methods
described here or other methods, including methods utilizing nucleic acid
hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage antimicrobial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences that are
highly homologous. The bacteriophage segment from a specific phage, e.g., an
antimicrobial DNA segment, can be used to identify a related segment from another
unrelated phage based on stringent conditions of hybridization or on being a homolog
based on nucleic acid and/or amino acid sequence comparisons. As with identified
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.

The nucleotide and amino acid sequences identified herein are believed to be
correct; however, certain sequences may contain a small percentage of errors, e.g.,
1-5%. In the event that any of the sequences have errors, the corrected sequences can be
readily provided by one skilled in the art using routine methods. For example, the
nucleotide sequences can be confirmed or corrected by obtaining and culturing the
relevant phage, and purifying phage genomic nucleic acids. A region or regions of
interest can be amplified, e.g., by PCR from the appropriate genomic template, using
primers based on the described sequence. The amplified regions can then be
sequenced using any of the available methods (e.g., a dideoxy termination method).
This can be done redundantly to provide the corrected sequence or to confirm that the
described sequence is correct. Alternatively, a particular sequence or sequences can
be identified and isolated as an insert or inserts in a phage genomic library and
isolated, amplified, and sequenced by standard methods. Confirmation or correction
of a nucleotide sequence for a phage gene provides an amino acid sequence of the
encoded product by merely reading off the amino acid sequence according to the
normal codon relationships, and/or the sequence can be expressed in a standard
expression system and the polypeptide product sequenced by standard techniques.
The sequences described herein thus provide unique identification of the
corresponding genes, coding sequences, and other sequences, allowing those
sequences to be used in the various aspects of the present invention.
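
Reading off an amino acid sequence from a confirmed nucleotide sequence according to the normal codon relationships can be sketched as follows. This is an illustrative example only: it assumes the standard genetic code, a sequence already in the correct reading frame, and hypothetical names (`CODON_TABLE`, `translate`); it does not handle alternative start codons or organism-specific code variations.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each
# position; '*' marks the three stop codons (TAA, TAG, TGA).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence into one-letter
    amino acid codes, stopping at the first stop codon."""
    sequence = coding_sequence.upper().replace("U", "T")
    residues = []
    for i in range(0, len(sequence) - 2, 3):
        # Codons containing ambiguous bases are reported as 'X'.
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)
```

As a usage example, the four alanine codons cited earlier, concatenated as "GCTGCCGCAGCG", translate to "AAAA", and "ATGGCTTAA" translates to "MA". Any trailing bases that do not form a complete codon are ignored.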

In other aspects, the invention provides recombinant vectors and cells
harboring at least one of the phage ORFs or portion thereof, or bacterial target
sequences described herein. As understood by those skilled in the art, vectors may be
provided in different forms, including, for example, plasmids, cosmids, and
virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; see also, Ausubel,
F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons,
Secaucus, N.J.

In preferred embodiments, the vectors will be expression vectors, preferably
shuttle vectors that permit cloning, replication, and expression within bacteria. An
"expression vector" is one having regulatory nucleotide sequences containing
transcriptional and translational regulatory information that controls expression of the
nucleotide sequence in a host cell. Preferably the vector is constructed to allow
amplification from vector sequences flanking an insert locus. In certain embodiments,
the expression vectors may additionally or alternatively support expression and/or
replication in animal, plant and/or yeast cells due to the presence of suitable
regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer
sequences, etc. In preferred embodiments, the promoters are inducible and specific
for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein
purification. Convenient restriction enzyme cloning sites and suitable selective
marker(s) are also optionally included. Such selective markers can be, for example,
antibiotic resistance markers or markers which supply an essential nutritive growth
factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in
the Yeast Two-Hybrid systems described below.

The term "recombinant vector" relates to a single- or double-stranded circular
nucleic acid molecule that can be transfected into cells and replicated within or
independently of a cell genome. A circular double-stranded nucleic acid molecule can
be cut and thereby linearized upon treatment with appropriate restriction enzymes. An
assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the
nucleotide sequences cut by restriction enzymes are readily available to those skilled
in the art. A nucleic acid molecule encoding a desired product can be inserted into a
vector by cutting the vector with restriction enzymes and ligating the two pieces
together. Preferably the vector is an expression vector, e.g., a shuttle expression
vector as described above.

By "recombinant cell" is meant a cell possessing introduced or engineered
nucleic acid sequences, e.g., as described above. The sequence may be in the form of 
or part of a vector or may be integrated into the host cell genome. Preferably the cell 
is a bacterial cell. 

In another aspect, the invention also provides methods for identifying and/or
screening compounds "active on" at least one bacterial target of a bacteriophage
inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial
target or targets (e.g., bacterial target proteins) with a test compound, and determining
whether the compound binds to or reduces the level of activity of the bacterial target
(e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a
cell-based assay) or in vitro, e.g., in a cell-free system under approximately physiological
conditions.

The compounds that can be used may be large or small, synthetic or natural, 
organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments, 
the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor
protein or fragment or derivative thereof, preferably an "active portion", or a small 
molecule. 

In preferred embodiments, the bacterial target is a target of a phage ORF
identified herein, e.g., S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

In particular embodiments, the methods include the identification of bacterial 
targets or the site of action of an inhibitor on a bacterial target as described above or 
otherwise described herein. 

In embodiments involving binding assays, preferably binding is to a fragment
or portion of a bacterial target protein, where the fragment includes less than 90%,
80%, 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably,



WO 00/32825 



PCT/IB99/02040



the at least one bacterial target includes a plurality of different targets of 
bacteriophage inhibitor proteins, preferably a plurality of different targets. The 
plurality of targets can be in or from a plurality of different bacteria, but preferably is 
from a single bacterial species. 

A "method of screening" refers to a method for evaluating a relevant activity
or property of a large plurality of compounds (e.g., a bacteria-inhibiting activity),
rather than just one or a few compounds. For example, a method of screening can be
used to conveniently test at least 100, more preferably at least 1000, still more
preferably at least 10,000, and most preferably at least 100,000 different compounds,
or even more.

In the context of this invention, the term "small molecule" refers to 
compounds having molecular mass of less than 2000 Daltons, preferably less than 
1500, still more preferably less than 1000, and most preferably less than 600 Daltons. 
Preferably but not necessarily, a small molecule is not an oligopeptide. 

In a related aspect or in preferred embodiments, the invention provides a
method of screening for potential antibacterial agents by determining whether any of a
plurality of compounds, preferably a plurality of small molecules, is active on at least
one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments
include those described for the above aspect, including embodiments which involve
determining whether one or more test compounds bind to or reduce the level of
activity of a bacterial target, and embodiments which utilize a plurality of different
targets as described above.

The identification of bacteria-inhibiting phage ORFs and their encoded
products also provides a method for identifying an active portion of such an encoded
product. This also provides a method for identifying a potential antibacterial agent by
identifying such an active portion of a phage ORF or ORF product. In preferred
embodiments, the identification of an active portion involves one or more of
mutational analysis, deletion analysis, or analysis of fragments of such products. The
method can also include determination of a 3-dimensional structure of an active
portion, such as by analysis of crystal diffraction patterns. In further embodiments,
the method involves constructing or synthesizing a peptidomimetic compound, where
the structure of the peptidomimetic compound corresponds to the structure of the
active portion. In this context, "corresponds" means that the peptidomimetic
compound structure has sufficient similarities to the structure of the active portion that
the peptidomimetic will interact with the same molecule as the phage protein and
preferably will elicit at least one cellular response in common which relates to the
inhibition of the cell by the phage protein.

In preferred embodiments, the ORF or ORF product is or is derived or
obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof.

The methods for identifying or screening for compounds or agents active on a 
bacterial target of a phage-encoded inhibitor can also involve identification of a 
phage-specific site of action on the target. 

Preferably in the methods for identifying or screening for compounds active 
on such a bacterial target, the target is uncharacterized; the target is from an 
uncharacterized bacterium from Table 1; the site of action is a phage-specific site of
action. 

Further embodiments include the identification of inhibitor phage ORFs and 
bacterial targets as in aspects above. 

An "active portion" as used herein denotes an epitope, a catalytic or regulatory 
domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a 
significant factor in, bacterial target inhibition. The active portion preferably may be 
removed from its contiguous sequences and, in isolation, still effect inhibition. 

By "mimetic" is meant a compound structurally and functionally related to a 
reference compound that can be natural, synthetic, or chimeric. In terms of the present 
invention, a "peptidomimetic," for example, is a compound that mimics the
activity-related aspects of the 3-dimensional structure of a peptide or polypeptide in a
non-peptide compound, for example mimics the structure of a peptide or active portion of
a phage- or bacterial ORF-encoded polypeptide. 

A related aspect provides a method for inhibiting a bacterial cell by contacting 
the bacterial cell with a compound active on a bacterial target of a bacteriophage 
inhibitor protein or RNA, where the target was uncharacterized. In preferred 
embodiments, the compound is such a protein, or a fragment or derivative thereof; a 
structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small 
molecule; the contacting is performed in vitro; the contacting is performed in vivo in
an infected or at risk organism, e.g., an animal such as a mammal or bird, for example,
a human, or other mammal described herein; the bacterium is selected from a genus
and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized;
the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1;
the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

In the context of targets in this invention, the term "uncharacterized" means
that the target was not recognized as an appropriate target for an antibacterial agent
prior to the filing of the present application or alternatively prior to the present
invention. Such lack of recognition can include, for example, situations where the
target and/or a nucleotide sequence encoding the target were unknown, situations
where the target was known, but where it had not been identified as an appropriate
target or as an essential cellular component, and situations where the target was
known as essential but had not been recognized as an appropriate target due to a belief
that the target would be inaccessible or otherwise that contacting the cell with a
compound active on the target in vitro would be ineffective in cellular inhibition, or
ineffective in treatment of an infection. Methods described herein utilizing bacterial
targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize
"uncharacterized target sites", meaning that the target has been previously recognized
as an appropriate target for an antibacterial agent, but where an agent or inhibitor of
the invention is used which acts at a different site than that at which the previously
utilized antibacterial agent acts, i.e., a phage-specific site. Preferably the phage-specific
site has different functional characteristics from the previously utilized site. In the
context of targets or target sites, the term "phage-specific" indicates that the target or
site is utilized by at least one bacteriophage as an inhibitory target and is different
from previously identified targets or target sites.

In the context of this invention, the term "bacteriophage inhibitor protein"
refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits
bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product.

In the context of this invention, the phrase "contacting the bacterial cell with a
compound active on a bacterial target of a bacteriophage inhibitor protein" or
equivalent phrases refer to contacting with an isolated, purified, or enriched
compound or a composition including such a compound, and specifically do not rely
on contacting the bacterial cell with an intact phage which encodes the compound.
Preferably no intact phage are involved in the contacting.

Related aspects provide methods for prophylactic or therapeutic treatment of a
bacterial infection by administering to an infected, challenged or at risk organism a
therapeutically or prophylactically effective amount of a compound active on a target
of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect.
Preferably the bacterium involved in the infection or risk of infection produces the
identified target of the bacteriophage inhibitor protein or alternatively produces a
homologous target compound. In preferred embodiments, the host organism is a plant
or animal, preferably a mammal or bird, and more preferably, a human or other
mammal described herein. Preferred embodiments include, without limitation, those
as described for the preceding aspect.

Compounds useful for the methods of inhibiting, methods of treating, and 
pharmaceutical compositions can include novel compounds, but can also include 
compounds which had previously been identified for a purpose other than inhibition
of bacteria. Such compounds can be utilized as described and can be included in 
pharmaceutical compositions. 

In preferred embodiments of this and other aspects of the invention utilizing
bacterial target sequences of a bacteriophage inhibitory ORF product, the target
sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S.
aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus
pneumoniae, or an Enterococcus nucleic acid coding sequence. Possible target
sequences are described herein by reference to sequence source sites.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not written out in full herein; they are instead described by reference to the
GenBank entries. In cases where the TIGR or GenBank entry for a coding region is
not complete, the complete sequence can be readily obtained by routine methods,
e.g., by isolating a clone in a phage host genomic library, and sequencing the clone
insert to provide the relevant coding region. The boundaries of the coding region can
be identified by conventional sequence analysis and/or by expression in a bacterium
in which the endogenous copy of the coding region has been inactivated and using
subcloning to identify the functional start and stop codons for the coding region.
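
By way of illustration only, translating a coding region into its amino acid
sequence can be sketched in a few lines of Python. The sketch assumes the
standard codon assignments and a coding region supplied 5' to 3' beginning at
the start codon; the names CODON_TABLE and translate are hypothetical and are
not part of the disclosure, and alternative bacterial start codons are not
given any special treatment.

```python
# Standard codon assignments, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_region):
    """Translate a DNA (or RNA) coding region into one-letter amino acid codes.

    Reading starts at the first base and stops at the first stop codon or at
    the end of the sequence, whichever comes first.
    """
    seq = coding_region.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons containing ambiguous bases are reported as "X".
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

For example, translate("ATGGCCTAA") reads the codons ATG, GCC, and TAA and
returns the two-residue string "MA".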

In the context of nucleic acid or amino acid sequences of this invention, the
term "corresponding" indicates that the sequence is at least 95% identical, preferably
at least 97% identical, and more preferably at least 99% identical to a sequence from
the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent
(utilizing one or more degenerate codons), or a homologous sequence, where the
homolog provides functionally equivalent biological function.
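
The identity thresholds in this definition can be illustrated with a minimal
Python sketch. It assumes the two sequences have already been aligned to equal
length by some other tool, with gap positions written as "-"; the function
names percent_identity and identity_tier are hypothetical, and counting a gap
column as a mismatch is an assumption of this sketch rather than a convention
stated herein.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned to the same length; a column containing a
    gap character ("-") in either sequence is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(identity):
    """Return the highest threshold (99, 97, or 95) met, or None if below 95%."""
    for threshold in (99, 97, 95):
        if identity >= threshold:
            return threshold
    return None
```

Under these assumptions, a 100-column alignment with three mismatches has 97%
identity and falls in the "at least 97% identical" tier.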

By "treatment" or "treating" is meant administering a compound or 
pharmaceutical composition for prophylactic and/or therapeutic purposes. The term 
"prophylactic treatment" refers to treating a patient or animal that is not yet infected 
but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic
treatment" refers to administering treatment to a patient already suffering from
infection.

The term "bacterial infection" refers to the invasion of the host organism,
animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria
which are normally present in or on the body of the organism, but more generally, a
bacterial infection can be any situation in which the presence of a bacterial
population(s) is damaging to a host organism. Thus, for example, an organism suffers
from a bacterial infection when excessive numbers of a bacterial population are
present in or on the organism's body, or when the effects of the presence of a bacterial
population(s) are damaging to the cells, tissue, or organs of the organism.

The terms "administer", "administering", and "administration" refer to a
method of giving a dosage of a compound or composition, e.g., an antibacterial
pharmaceutical composition, to an organism. Where the organism is a mammal, the
method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular,
or intrathecal. The preferred method of administration can vary depending on various
factors, e.g., the components of the pharmaceutical composition, the site of the
potential or actual bacterial infection, the bacterium involved, and the infection
severity.

The term "mammal" has its usual biological meaning referring to any
organism of the Class Mammalia of higher vertebrates that nourish their young with
milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine,
sheep, swine, dog, and cat.

In the context of treating a bacterial infection a "therapeutically effective
amount" or "pharmaceutically effective amount" indicates an amount of an
antibacterial agent, e.g., as disclosed for this invention, which has a therapeutic effect.
This generally refers to the inhibition, to some extent, of the normal cellular
functioning of bacterial cells that causes or contributes to bacterial infection.
The dose of antibacterial agent that is useful as a treatment is a
"therapeutically effective amount." Thus, as used herein, a therapeutically effective
amount means an amount of an antibacterial agent that produces the desired
therapeutic effect as judged by clinical trial results and/or animal models. This amount
can be routinely determined by one skilled in the art and will vary depending on
several factors, such as the particular bacterial strain involved and the particular
antibacterial agent used.

In connection with claims to methods of inhibiting bacteria and therapeutic or
prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor
protein" or terms of equivalent meaning differ from administration of or contact with
an intact phage naturally encoding the full-length inhibitor compound. While an
intact phage may conceivably be incorporated in the present methods, the method at
least includes the use of an active compound as specified different from a full-length
inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting
method different from administration of or contact with an intact phage encoding the
full-length protein. Similarly, pharmaceutical compositions described herein at least
include an active compound different from a full-length inhibitor protein naturally
encoded by a bacteriophage or such a full-length protein is provided in the
composition in a form different from being encoded by an intact phage. Preferably
the methods and compositions do not include an intact phage.

In accord with the above aspects, the invention also provides antibacterial 
agents and compounds active on bacterial targets of bacteriophage inhibitor proteins 
or RNAs, where the target was uncharacterized as indicated above. As previously 
indicated, such active compounds include both novel compounds and compounds 
which had previously been identified for a purpose other than inhibition of bacteria. 
Such previously identified biologically active compounds can be used in 
embodiments of the above methods of inhibiting and treating. In preferred 
embodiments, the targets, bacteriophage, and active compound are as described herein 
for methods of inhibiting and methods of treating. Preferably the agent or compound 
is formulated in a pharmaceutical composition which includes a pharmaceutically 
acceptable carrier, excipient, or diluent. In addition, the invention provides agents, 
compounds, and pharmaceutical compositions where an active compound is active on 
an uncharacterized phage-specific site. 

In preferred embodiments, the target is as described for embodiments of 
aspects above. 

Likewise, the invention provides a method of making an antibacterial agent. 
The method involves identifying a target of a bacteriophage inhibitor polypeptide or 
protein or RNA, screening a plurality of compounds to identify a compound active on 
the target, and synthesizing the compound in an amount sufficient to provide a 
therapeutic effect when administered to an organism infected by a bacterium naturally 
producing the target. In preferred embodiments, the identification of the target and 
identification of active compounds include steps or methods and/or components as 
described above (or otherwise herein) for such identification. Likewise, the active 
compound can be as described above, including fragments and derivatives of phage 
inhibitor proteins, peptidomimetics, and small molecules. As recognized by those 
skilled in the art, peptides can be synthesized by expression systems and purified, or 
can be synthesized artificially. In preferred embodiments the inhibitory phage ORF
product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated above, sequence analysis of nucleotide and/or amino acid
sequences can beneficially utilize computer analysis. Thus, in additional aspects the
invention provides computer-related hardware and media and methods utilizing and
incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage
listed in Table 1, preferably at least one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014, or
S. aureus phage 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae
phage Dp-1. In general, such aspects can facilitate the
above-described aspects. Various embodiments involve the analysis of genetic
sequence and encoded products, as applied to evaluating bacteriophage inhibitor
ORFs and compounds and fragments related thereto. The various sequence analyses,
as well as function analyses, can be used separately or in combination, as well as in
preceding aspects and embodiments. Use in combination is often advantageous as the
additional information allows more efficient prioritizing of phage ORFs for
identification of those ORFs that provide bacteria-inhibiting function.

In one aspect, the invention provides a computer-readable device which
includes at least one recorded amino acid or nucleotide sequence corresponding to one
of the specified phage and a sequence analysis program for analyzing a nucleotide
and/or amino acid sequence. The device is arranged such that the sequence
information can be retrieved and analyzed using the analysis program. The analysis
can identify, for example, homologous sequences or the indicated percentages of the
phage genome and structural motifs. Preferably the sequence includes at least 1 phage
ORF or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%,
90%, or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino
acid sequences. Preferably the sequence or sequences in the device are recorded in a
medium such as a floppy disk, a computer hard drive, an optical disk, computer
random access memory (RAM), or magnetic tape. The program may also be recorded
in such medium. The sequences can also include sequences from a plurality of
different phage.
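
As a non-authoritative illustration of the kind of structural motif analysis
such a device could run on a recorded amino acid sequence, the following Python
sketch scans a protein for a signature pattern. The pattern used, the Walker A
(P-loop) nucleotide-binding signature [AG]-x(4)-G-K-[ST], is chosen here only
as a familiar example and is not a motif identified in this description; the
names WALKER_A and find_motif are likewise hypothetical.

```python
import re

# Example signature only: Walker A (P-loop) motif, [AG]-x(4)-G-K-[ST],
# written as a regular expression over one-letter amino acid codes.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_motif(protein, pattern=WALKER_A):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]
```

A sequence analysis program of the type described would typically apply many
such patterns to each predicted ORF product and report the hits to the output
device.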

In this context, the term "corresponding" indicates that the sequence is at least
95% identical, preferably at least 97% identical, and more preferably at least 99%
identical to a sequence from the specified phage genome, a ribonucleotide equivalent,
a degenerate equivalent (utilizing one or more degenerate codons), or a homologous
sequence, where the homolog provides functionally equivalent biological function.

Similarly, the invention provides a computer analysis system for identifying
biologically important portions of a bacteriophage genome. The system includes a
data storage medium, e.g., as identified above, which has recorded thereon a
nucleotide sequence corresponding to at least a portion of at least one uncharacterized
bacteriophage genome, a set of program instructions to allow searching of the
sequence or sequences to analyze the sequence, and an output device, where the
portion includes at least the sequence length as specified in the preceding aspect. The
output device is preferably a printer, a video display, or a recording medium. More
than one output device may be included. For each of the present computer-related
aspects, the bacteriophage are preferably selected from the uncharacterized phage
listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44AHJD (S.
aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus).

In keeping with the computer device aspects, the invention also provides a
method for identifying or characterizing a bacteriophage ORF by providing a
computer-based system for analyzing nucleotide or amino acid sequences, e.g., as
described above. The system includes a data storage medium which has recorded a
sequence or sequences as described for the above devices, a set of instructions as in
the preceding aspect, and an output device as in the preceding aspect. The method
further involves analyzing at least one sequence, and outputting the analysis results to
at least one output device.
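
One analysis of this kind, the enumeration of candidate ORFs in a recorded
phage genome, can be sketched as follows. This is a minimal Python sketch and
not the program actually used: it assumes ORFs run from an ATG to the first
in-frame stop codon, applies the "greater than 33 amino acids" cutoff used
elsewhere in this description, and does not check for a ribosomal binding
site. The name find_orfs and the tuple layout are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_orfs(genome, min_aa=34):
    """List ORFs of at least min_aa codons in all six reading frames.

    Each result is (strand, start, end, length_aa), where start and end are
    0-based, end-exclusive positions on the strand that was scanned (so "-"
    strand coordinates refer to the reverse-complemented sequence) and
    length_aa counts codons from the ATG up to, but not including, the stop.
    """
    orfs = []
    forward = genome.upper()
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG after the previous stop opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    length_aa = (i - start) // 3
                    if length_aa >= min_aa:
                        orfs.append((strand, start, i + 3, length_aa))
                    start = None
    return orfs
```

The returned list could then be written to any of the output devices
mentioned above, or passed on to homology and motif searches.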

In preferred embodiments, the analysis identifies a sequence similarity or
homology with a sequence or sequences selected from bacterial ORFs encoding
products with related biological function; ORFs encoding known inhibitors; and
essential bacterial ORFs. Preferably the analysis identifies a probable biological
function based on identification of structural elements or characteristic or signature
motifs of an encoded product or on sequence similarity or homology. Preferably the
uncharacterized bacteriophage is from Table 1, more preferably at least one of
bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or
182 (Enterococcus). In preferred embodiments, the method also involves determining
at least a portion of the nucleotide sequence of at least one uncharacterized
bacteriophage as indicated, and recording that sequence on data storage medium of the
computer-based system. In preferred embodiments, the analysis identifies a sequence
similarity or homology with a S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As used in the claims to describe the various inventive aspects and
embodiments, "comprising" means including, but not limited to, whatever follows the
word "comprising". Thus, use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other elements are optional and may or
may not be present. By "consisting of" is meant including, and limited to, whatever
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the
listed elements are required or mandatory, and that no other elements may be present.
By "consisting essentially of" is meant including any elements listed after the phrase,
and limited to other elements that do not interfere with or contribute to the activity or
action specified in the disclosure for the listed elements. Thus, the phrase "consisting
essentially of" indicates that the listed elements are required or mandatory, but that
other elements are optional and may or may not be present depending upon whether or
not they affect the activity or action of the listed elements.

Further embodiments will be apparent from the following Detailed Description
and from the claims.



BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A and 1B are flow schematics showing the manipulations used to
convert pT0021, an arsenite inducible vector containing the luciferase gene, into
pTHA or pTM, two ars inducible vectors. Vector pTHA contains Bam HI, Sal I, and
Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains Bam
HI and Hind III cloning sites and no HA epitope tag.

FIGURE 2 is a schematic representation of the cloning steps involved to place
the DNA segments of any of ORFs 17/19/43/102/104/182 or other sequences into
pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual
ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop
codons of the ORFs. Using this strategy, Bam HI and Hind III sites were positioned
immediately upstream and downstream, respectively, of the start and stop codons of
each ORF. Following digestion with Bam HI and Hind III, the PCR fragments were
subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and
direct sequencing.
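
The primer strategy described for FIGURE 2 can be illustrated with a short
Python sketch that places a Bam HI site (GGATCC) directly upstream of the ATG
and a Hind III site (AAGCTT) directly downstream of the stop codon. The
function name design_primers, the 18-nucleotide annealing length, and the
three-base 5' clamp are assumptions made for illustration; they are not the
oligonucleotides actually used.

```python
BAM_HI = "GGATCC"    # Bam HI recognition site
HIND_III = "AAGCTT"  # Hind III recognition site
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_primers(orf, anneal=18, clamp="CGC"):
    """Return (forward, reverse) primers, both written 5' to 3'.

    orf is the coding strand from the ATG through the stop codon. The forward
    primer is clamp + Bam HI site + the first `anneal` bases of the ORF; the
    reverse primer is clamp + Hind III site + the reverse complement of the
    last `anneal` bases, so the site lands just downstream of the stop codon.
    """
    orf = orf.upper()
    if len(orf) < anneal:
        raise ValueError("ORF is shorter than the requested annealing length")
    forward = clamp + BAM_HI + orf[:anneal]
    reverse = clamp + HIND_III + reverse_complement(orf[-anneal:])
    return forward, reverse
```

In practice the annealing length would be adjusted per ORF for melting
temperature, and any ORF containing an internal Bam HI or Hind III site would
need a different enzyme pair.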

FIGURE 3 shows a schematic representation of the functional assays used to
characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid
support media. Fig. 3B) Functional assay in liquid culture.

FIGURES 4A, B, and C are bar graphs showing the results of a screen in liquid
media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed
as detailed in the Detailed Description. The relative growth of Staphylococcus aureus
transformants harboring a given bacteriophage 77 ORF (identified on the bottom of
the graph), in the absence or presence of arsenite, is plotted relative to growth of a
Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77
ORF (which is set at 100%). Each bar represents the average obtained from three
Staphylococcus aureus transformants grown in duplicate. Bacteriophage 77 ORFs
showing significant growth inhibition are ORFs 17, 19, 102, 104, and 182.
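
The normalization described for FIGURE 4 can be illustrated with a minimal
Python sketch that averages the growth readings for each ORF and expresses the
average as a percentage of the non-toxic control (ORF 5, set at 100%). The
function name relative_growth and the dictionary layout are hypothetical, and
the sketch assumes one list of readings per ORF (for example, three
transformants grown in duplicate); it does not reproduce any data from the
screen.

```python
from statistics import mean


def relative_growth(readings_by_orf, control="ORF5"):
    """Express mean growth per ORF as a percentage of the control ORF's mean.

    readings_by_orf maps an ORF label to a list of growth readings (e.g.
    optical densities) taken under one condition, with or without arsenite.
    """
    control_mean = mean(readings_by_orf[control])
    if control_mean == 0:
        raise ValueError("control growth is zero; cannot normalize")
    return {
        orf: 100.0 * mean(values) / control_mean
        for orf, values in readings_by_orf.items()
    }
```

With made-up readings of 1.0 for the control and 0.25 for a test ORF, the test
ORF would be plotted at 25% of control growth.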

FIGURE 5 shows a block diagram of major components of a general purpose 
computer. 

FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage 
Dp-1 showing the ORF identifiers, genomic locations, and orientations of the 85 
identified ORFs that were found to have ribosomal binding sites and thus are expected 
to be expressed. 

FIGURE 7 shows a schematic representation of the arsenite-inducible 
expression system present in a shuttle vector designed to express individual 
Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can 
be readily made to such a vector, or other vectors can be readily constructed to 
provide inducible expression of ORFs in a particular host bacterium using well-known 
techniques.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention may be more clearly understood from the following description.
The tables will first be briefly described.

Table 1 is a listing of a large number of available bacteriophage that can be 
readily obtained and used in the present invention. 

Table 2 shows the complete nucleotide sequence of the genome of
Staphylococcus aureus bacteriophage 77.

Table 3 shows a list of all the ORFs from Bacteriophage 77 that were screened
in the functional assay to identify those with anti-microbial activity.

Table 4 shows the predicted nucleotide sequence, predicted amino acid
sequence, and physicochemical parameters of ORFs 17/19/43/102/104/182. These
include the primary amino acid sequence of the predicted protein, the average
molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and
predicted secondary structure map.

Table 5 shows homology search results. BLAST analysis was performed with
ORFs 17/19/43/102/104/182 against NCBI non-redundant nucleotide and
Swissprot databases. The results of this search indicate that: I) ORF 17 has no
significant homology to any gene in the NCBI non-redundant nucleotide
database, II) ORF 19 has significant homology to one gene in the NCBI
non-redundant nucleotide database - the gene encoding ORF 59 of bacteriophage phi PVL,
III) ORF 43 has significant homology to one gene in the NCBI non-redundant
nucleotide database - the gene encoding ORF 39 of phi PVL, IV) ORF 102 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 38 of phi PVL, V) ORF 104 has no significant homology to
any gene in the NCBI non-redundant nucleotide database, VI) ORF 182 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 39 of phi PVL.

Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE
CELL 3rd ed., showing the redundancy of the "universal" genetic code.

Table 7 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 3A. 



WO 00/32825 



PCT/IB99/02040 



Table 8 is a listing of the ORFs identified in Staphylococcus aureus 
bacteriophage 3A. 

Table 9 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 96. 

Table 10 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 96.

Table 11 is a listing of sequences deposited in the NCBI public database
(GenBank) for bacteriophage listed in Table 1.

Table 12 is a listing of phage which encode a known lysis function, including
the identified lysis gene.

Table 13 is a listing of bacteriophage which encode holin genes, where holin 
genes encode proteins which form pores and eventually enable other enzymes to kill 
the host bacterium. 

Table 14 is a listing of bacteriophage which encode kil genes. 

Table 15 is a list of Staphylococcus aureus sequences identified by accession
number which may include sequences from genes coding for target sequences for the
phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained
by searching GenBank for listings.

Table 16 shows the nucleotide sequence of the genome of Staphylococcus
aureus phage 44AHJD.

Table 17 lists and shows the sequence position of the 73 ORFs predicted to be
encoded by Staphylococcus aureus bacteriophage 44AHJD that are greater than 33
amino acids.

Table 18 shows the ORF sequences and putative amino acid sequences for the
Staphylococcus aureus bacteriophage 44AHJD ORFs greater than 33 amino acids.

Table 19 shows the similarities in sequence identified between predicted
Staphylococcus aureus bacteriophage 44AHJD ORFs and sequences present in public
databases.

Table 20 shows the homology alignments between predicted Staphylococcus
aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present
in public sequence databases.

Table 21 shows the complete nucleotide sequence of the genome of
Enterococcus bacteriophage 182.

Table 22 lists and shows the sequence position of the 80 ORFs identified in
bacteriophage 182 and that are greater than 33 amino acids.

Table 23 shows the nucleotide and predicted amino acid sequence of all 80
ORFs identified in bacteriophage 182.

Table 24 shows the similarities identified to date in sequence between
Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in
public sequence databases.

Table 25 shows the predicted amino acid sequence as well as the predicted
secondary structures map for two Enterococcus bacteriophage 182 ORFs.

Table 26 shows the homology alignments between predicted Enterococcus
bacteriophage 182 ORFs and the corresponding protein sequences present in public
sequence databases.

Table 27 lists Enterococcus sequences listed in GenBank providing possible
Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs
and other compounds with antibacterial activity.

Table 28 shows the complete nucleotide sequence of the genome of
Streptococcus bacteriophage Dp-1.

Table 29 lists and shows sequence position of the 273 ORFs identified in
Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which
are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of
85 ORFs is shown in the attached drawings.

Table 30 shows the nucleotide and predicted amino acid sequence of all 273
ORFs identified in bacteriophage Dp-1 that are identified as being expressed.

Table 31 shows the similarities identified in sequence between Streptococcus 
phage Dp-1 ORFs greater than 33 amino acids and sequences present in public 
sequence databases. 

Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al.
(1997).

Table 33 lists Streptococcus pneumoniae sequences listed in GenBank 
providing possible target sequences for inhibitory Streptococcus pneumoniae 
bacteriophage Dp-1 ORFs and other compounds with antibacterial activity.

Background: 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to
identify bacterial targets for potential new antibacterial agents. Thus, the invention
concerns the selection of relevant bacteria. Particularly relevant bacteria are those
which are pathogens of complex organisms such as animals (e.g., mammals,
reptiles, and birds) and plants. Examples include Staphylococcus aureus, Enterococcus
species, and Streptococcus pneumoniae. However, the invention can be applied to
any bacterium (whether pathogenic or not) for which bacteriophage are available or
which are found to have cellular components closely homologous to components
targeted by phage of another bacterium.

Thus, the invention also concerns the bacteriophage which can infect a
selected bacterium. Identification of ORFs or products from the phage which inhibit
the host bacterium both provides an inhibitor compound and allows identification of
the bacterial target affected by the phage-encoded inhibitor. Such targets are thus
identified as potential targets for development of other antibacterial agents or
inhibitors and the use of those targets to inhibit those bacteria. As indicated above,
even if such a target is not initially identified in a particular bacterium, such a target
can still be identified if a homologous target is identified in another bacterium.
Usually, but not necessarily, such another bacterium would be a genetically closely
related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit
such a homologous bacterial cellular component.

The demonstration that bacteriophage have adapted to inhibiting a host
bacterium by acting on a particular cellular component or target provides a strong
indication that that component is an appropriate target for developing and using
antibacterial agents, e.g., in therapeutic treatments. Thus, the present invention
provides additional guidance over mere identification of bacterial essential genes, as
the present invention also provides an indication of accessibility of the target to an
inhibitor, and an indication that the target is sufficiently stable over time (e.g., not
subject to high rates of mutation) as phage acting on that target were able to develop
and persist. Thus, the present invention identifies a subset of essential cellular
components which are particularly likely to be appropriate targets for development of
antibacterial agents.

The invention also, therefore, concerns the development or identification of
inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA
transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As
described herein, such inhibitors can be of a variety of different types, but are
preferably small molecules.

The following description provides preferred methods for use in the various
aspects of the invention. However, as those skilled in the art will readily recognize,
other approaches can be used to obtain and process relevant information. Thus, the
invention is not limited to the specifically described methods. In addition, the
following description provides a set of steps in a particular order. That series of steps
describes the overall development involved in the present invention. However, it is
clear that individual steps or portions of steps may be usefully practiced separately,
and, further, that certain steps may be performed in a different order or even bypassed
if appropriate information is already available or is provided by other sources or
methods.

Selecting and Growing Phage, and Isolating DNA 

Conceptually, the first step involves selecting bacterial hosts of interest.
Preferably, but not necessarily, such hosts will be pathogens of clinical importance.
Alternatively, because bacteria all share certain fundamental metabolic and structural
features, these features can be studied in one strain, for example a nonpathogenic
one, and the results extrapolated to pathogenic strains. Nonpathogenic strains may
also offer initial advantages in being not only less dangerous, but also, for example,
in having better growth and culturing characteristics and/or better developed
molecular biology techniques and reagents. Consequently, the invention
advantageously provides the ability to target virtually any bacteria, but preferably
pathogenic bacteria, with antimicrobial compounds designed and/or developed using
bacteriophage inhibitory proteins and peptides from phage with nonpathogenic
and/or pathogenic hosts.

We have selected Staphylococcus aureus, Streptococcus pneumoniae, various
Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These
bacteria are a major cause of morbidity and mortality in hospital-based infections, and
the appearance of antibiotic resistance in these organisms makes it increasingly
difficult to treat even benign infections involving them. Such infections can
include, for example, otitis media, sinusitis, and skin and airway infections (Neu,
H.C. (1992). Science 257, 1064-1073). However, the approach described below is
clearly applicable to any human bacterial pathogen, including but not restricted to
Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae,
Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes,
Helicobacter pylori, and Mycoplasma species. This invention can also be applied to
the discovery of anti-bacterial compounds directed against pathogens of animals other
than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles.
Similarly, the invention is not limited to animals, but also applies to plants and plant
pathogens.

In general, the bacteria are grown according to standard methodologies
employed in the art, including solid, semi-solid or liquid culturing, which procedures
can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart,
V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring
Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; or
Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley &
Sons, Secaucus, N.J. Culture conditions are selected which are adapted to the
particular bacterium, generally using culture conditions known in the art as
appropriate, or adaptations of those conditions.

Nucleic acids within these bacteria can be routinely extracted through common
procedures such as described in the above-referenced manuals and as generally known
to those skilled in the art. Those nucleic acid stocks can then be used to practice the
other inventive aspects described below.

Selection and Growth of Bacteriophage, and Isolation of DNA 

The second step involves assembling a group of bacteriophages (phage
collection) for one or more of the targeted bacterial hosts. While the invention can be
utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable
to utilize a plurality of phage for each bacterium, as comparisons between a plurality
of such phage provide useful additional information. Non-limiting examples of
phage and sources for some of the above-mentioned pathogenic bacteria are found in
Table 1. The criteria used to select such phages are that they are infectious for the
microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium
in a measurable fashion. These phages can be very different from one another
(representing different families), as judged by criteria such as morphology (head, tail,
plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since
such diverse bacteriophages are expected to block bacterial host metabolism and
ultimately inhibit growth by a variety of mechanisms, their combined study will lead
to the identification of different mechanisms by which the phages independently inhibit
bacterial targets. Examples include degradation of host DNA (Parson K.A., and
Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription
(Severinova, E., Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This,
in turn, yields novel information on phage proteins that can inhibit the targeted
microbe. As explained below, this 1) forms the basis of novel drug discovery efforts
based on knowledge of the primary amino acid sequence of the phage inhibitor
protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the
identification of bacterial biochemical pathways, the proteins of which are essential or
significant for survival of the targeted microbe, and which enzymatic steps or
chemical reactions can be targeted by classical drug discovery methods using
molecular inhibitors, for example, small molecule inhibitors.

Bacteriophage are generally either of two types, lytic or filamentous, meaning
they either outright destroy their host and seek out new hosts after replication, or else
continuously propagate and extrude progeny phage from the same host without
destroying it. Regardless of the phage life cycle and type, preferred embodiments
incorporate phage which impede cell growth in measurable fashion and preferably
stop cell growth. To this end, lytic phage are preferred, although certain nonlytic
species may also suffice, e.g., if sufficiently bacteriostatic.

Various procedures that are commonly understood by those of skill in the art
can be routinely employed to grow, isolate, and purify phage. Such procedures are
exemplified by those found in common laboratory aids such as Maloy, S.R.,
Stewart, V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold
Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; and
Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John
Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of
infected bacterial cells that are lysed naturally and/or with chemical assistance, for
example, by the use of an organic solvent such as chloroform that destroys the host
cells, thereby liberating the phage within. Following this, the cellular debris is
centrifuged away from the supernatant containing the phage particles, and the phage
are then selectively precipitated out of the supernatant using various
methods, usually employing alcohols and/or other chemical compounds
such as polyethylene glycol (PEG). The resulting phage can be further purified using
various density gradient/centrifugation methodologies. The resulting phage are then
chemically lysed, thereby releasing their nucleic acids, which can be conveniently
precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of
interest.

Exemplary bacteriophage are indicated in Table 1, along with sources where
those phage may be obtained.

Exemplary bacteria include the reference bacteria for the identified 
bacteriophage, available from the same sources. 

Characterizing Bacteriophage Genomes for ORFs 
The third step involves systematically characterizing the genetic information
contained in the phage genome. Within this genetic information is the sequence of all
RNAs and proteins encoded by the phage, including those that are essential or
instrumental in inhibiting their host. This characterization is preferably done in a
systematic fashion. For example, this can be done by first isolating high molecular
weight genomic DNA from the phage using standard bacterial lysis methods, followed
by phage purification using density gradient ultracentrifugation, and extraction of
nucleic acid from the purified phage preparation. The high molecular weight DNA is
then analyzed to determine its size and to evaluate a proper strategy for its sequencing.
The DNA is broken down into smaller size fragments by sonication or partial
digestion with frequently cutting restriction enzymes such as Sau3A to yield
predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel
electrophoresis followed by extraction from the gel.
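
The fragmentation and size-selection step can be illustrated in silico. The sketch
below is only an illustration under stated assumptions: it models a complete digest
of a linear sequence at every GATC site (the Sau3A recognition sequence), whereas the
partial digestion described above leaves some sites uncut and so yields longer
fragments. The function names and the size window are hypothetical and are not part
of the described method.

```python
import re

SAU3A_SITE = "GATC"  # Sau3A cuts immediately 5' of its GATC recognition site


def digest(sequence, site=SAU3A_SITE):
    """Return the fragments of a complete digest of a linear sequence."""
    sequence = sequence.upper()
    cut_points = [m.start() for m in re.finditer(site, sequence)]
    bounds = [0] + cut_points + [len(sequence)]
    # Skip empty fragments (e.g. when the sequence begins with a site).
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep fragments inside a size window, mimicking extraction from a gel."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

For a real phage genome the window would be set to roughly 1000 to 2000 bases to
match the 1 to 2 kilobase fragments described above.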

The ends of the fragments are enzymatically treated to render them suitable for
cloning and the pools of fragments are cloned in a bacterial plasmid to generate a
library of the phage genome. Several hundred of these random DNA fragments
contained in the plasmid vector are isolated as clones after introduction into an
appropriate bacterium, usually Escherichia coli. They are then individually expanded
in culture and the DNA from each individual clone is purified. The nucleotide
sequences of the inserts of these clones are determined by standard automated or
manual methods, using oligonucleotide primers located on either side of the cloning
site to direct polymerase mediated sequencing (e.g., the Sanger sequencing method or
a modification of that method). Other sequencing methods can also be used.

The sequences of the individual clones are then entered into a computer, and
specific software programs (for example, Sequencher™, Gene Codes Corp.) are used
to look for overlap between the various sequences, resulting in ordering of contig
sequences and ultimately providing the complete sequence of the entire bacteriophage
genome (one such example is given in Table 2 for Staphylococcus aureus
bacteriophage 77; others are also provided herein). This complete nucleotide
sequence is preferably determined with a redundancy of at least 3- to 5-fold (number
of independent sequencing events covering the same region) in order to minimize
sequencing errors.
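
The redundancy criterion can be checked once the reads have been placed on the
assembled genome. The following is a minimal sketch, assuming each read is given as a
(start, end) placement with 0-based, end-exclusive coordinates; the function names
and this input format are illustrative assumptions and do not describe the
Sequencher software.

```python
def coverage_depth(genome_length, reads):
    """Per-base depth from (start, end) read placements (0-based, end exclusive)."""
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1   # a read begins covering at 'start'
        diff[end] -= 1     # and stops covering at 'end'
    depth, running = [], 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


def low_coverage_regions(depth, min_fold=3):
    """Return (start, end) intervals whose depth is below min_fold."""
    regions, start = [], None
    for pos, d in enumerate(depth):
        if d < min_fold and start is None:
            start = pos
        elif d >= min_fold and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions
```

Regions reported by `low_coverage_regions` would be candidates for additional
sequencing before the genome is treated as complete.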

Preferably, the bacterial strain used as a phage host should not possess any
other innate plasmids, transposons, or other phage or incompatible sequences that
would complicate or otherwise make the various manipulations and analyses more
difficult.

Commercially available computer software programs are used to translate the
nucleotide sequence of the phage to identify all protein sequences encoded by the
phage (hereafter called open reading frames or ORFs). (Customized software can
clearly also be used.) As phages are known to transcribe their genome into RNA from
both strands, in both directions, and sometimes in more than one frame for the same
sequence, this exercise is done for both strands and in all six possible reading frames.
As evolutionary constraints have forced the phage to conserve all of its vital protein
sequences in as small a genome as possible, it is straightforward to identify all the
proteins encoded by the phage by simple examination of the 6 translation frames of
the genome. Once these ORFs are identified, they are cataloged into a phage
proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also
provided for other exemplary phage). This analysis is preferably performed for each
phage under study. The process of ORF identification can be varied depending on the
desired results. For example, the minimum length for the putative encoded
polypeptide can be varied, and/or putative coding regions that have an associated
Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such
parameter adjustment was performed and resulted in the identification of ORFs as
listed herein. Different parameters resulted in the identification of the ORFs
listed in the preceding U.S. Provisional Application 60/110,992, filed December 3,
1998, which is hereby incorporated by reference in its entirety.
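
The six-frame examination described above can be sketched as follows. This is a
minimal illustration, not the software actually used: it assumes the standard genetic
code, treats the set of permitted start codons and the minimum polypeptide length as
adjustable parameters (the default values shown are hypothetical), reports the
initiating residue as methionine, and omits any Shine-Dalgarno filter.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG codon order.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(genome, min_aa=33, starts=("ATG",)):
    """Return (strand, start, end, protein) for ORFs in all six reading frames.

    Coordinates are 1-based, inclusive, given on the forward strand, and
    include the stop codon. Each ORF runs from the first start codon in an
    open stretch to the next in-frame stop codon.
    """
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if orf_start is None and codon in starts:
                    orf_start = i
                elif orf_start is not None and codon in STOPS:
                    # The initiator codon is reported as Met whatever its sequence.
                    protein = "M" + "".join(
                        CODON_TABLE.get(seq[j:j + 3], "X")
                        for j in range(orf_start + 3, i, 3))
                    if len(protein) >= min_aa:
                        if strand == "+":
                            coords = (orf_start + 1, i + 3)
                        else:
                            coords = (n - (i + 3) + 1, n - orf_start)
                        orfs.append((strand, coords[0], coords[1], protein))
                    orf_start = None
    return orfs
```

Passing, for example, `starts=("ATG", "TTG", "GTG")` or a different `min_aa` changes
which ORFs are reported, which is one way different parameter choices can produce
different ORF lists for the same genome.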

Exemplary phage 77 ORFs identified in that provisional application and as 
identified herein are shown in the following table: 



ORF ID from   Genomic       a.a.   Start   ORF ID from   Genomic       a.a.   Start
60/110,992    position      size   codon   241/190       position      size   codon

77ORF016      23269-24024   251    TTG     77ORF017      23269-23982   237    ATG
77ORF019      39845-40501   218    ATA     77ORF019      39851-40501   216    ATG
77ORF050      29268-29564   98     ATG     77ORF182      29268-29564   98     ATG
77ORF050      29268-29564   98     ATG     77ORF043      29304-29564   86     ATG
77ORF067      34312-34551   79     CTG     77ORF104      34393-34551   52     ATG
77ORF146      29051-29212   53     ATG     77ORF102      29051-29212   53     ATG
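
The genomic positions and amino acid sizes in the table can be cross-checked
arithmetically. The sketch below assumes, as the listed values appear to imply, that
each genomic position is a 1-based inclusive range that includes the stop codon, so
that the polypeptide length equals the number of codons minus one. That reading is an
inference from the numbers, not a statement made in the table itself, and the
function name is hypothetical.

```python
def predicted_aa_length(start, end):
    """Amino acid count for an ORF whose inclusive coordinates include the stop codon."""
    span = abs(end - start) + 1
    if span % 3:
        raise ValueError("coordinates do not span a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


# (start, end, listed a.a. size) copied from the right-hand columns of the table.
TABLE_ROWS = {
    "77ORF017": (23269, 23982, 237),
    "77ORF019": (39851, 40501, 216),
    "77ORF182": (29268, 29564, 98),
    "77ORF043": (29304, 29564, 86),
    "77ORF104": (34393, 34551, 52),
    "77ORF102": (29051, 29212, 53),
}

for name, (start, end, listed) in TABLE_ROWS.items():
    assert predicted_aa_length(start, end) == listed, name
```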



Identifying and Characterizing Inhibitory Phage ORFs 

The fourth step entails identifying the phage protein or proteins or RNA
transcripts that have the ability to inhibit their bacterial hosts. This can be
accomplished, for example, by either or both of two non-mutually exclusive methods.
The first method makes use of bioinformatics. Over the past few years, a large amount
of nucleotide sequence information and corresponding translated products have
become available through large genome sequencing projects for a variety of
organisms including mammals, insects, plants, unicellular eukaryotes (yeast and
fungi), as well as several bacterial genomes such as E. coli, Mycobacterium
tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such
sequences have been deposited in public databases (for example, the non-redundant
sequence database at GenBank and the SwissProt protein sequence database)
(http://www.ncbi.nlm.nih.gov) and can be freely accessed to compare any specific
query sequence to those present in such databases. For example, GenBank contains
over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several
computer programs and servers (e.g., TBLASTN) have been created to allow the rapid
identification of homology between any given sequence from one organism and that of
another present in such databases, and such programs are public and available free of
charge.

In addition, it has been well established that basic biochemical pathways can
be conserved in very distant organisms (for example bacteria and man), and that the
proteins performing the various enzymatic steps in these pathways are themselves
conserved at the amino acid sequence level. Thus, proteins performing similar
functions (e.g. DNA repair, RNA transcription, RNA translation) have frequently
preserved key structural signatures, identifiable by similarities across regions of
proteins (domains and motifs). The antimicrobials of the present invention will
preferably target features and targets that are highly characteristic or conserved in
microbes, and not higher organisms.

Most genomes encode individual proteins or groups of proteins that can be
assembled into protein families that have been evolutionarily conserved. Therefore,
similarity between a new query sequence and that of a member of a protein family
(reference sequences from public databases) can immediately suggest a biochemical
function for the novel query sequence, which in our case is a phage ORF.

The sequence homology between evolutionarily distant members of a protein
family is usually not randomly distributed along the entire
length of the sequence but is often clustered into "motifs" and "domains". These
correspond to key three-dimensional folds that form key catalytic and/or regulatory
structures that perform key biochemical function(s) for the group of proteins.
Commercially available computer software programs can identify such motifs in a
new query sequence, again providing functional information for the query sequence.
Such structural and functional motifs have also been derived from the combined
analysis of primary sequence databases (protein sequences) and protein structure
databases (X-ray crystallography, nuclear magnetic resonance) using so-called
"threading" methods (Rost, B. and Sander, C. (1996) Ann. Rev. Biophys. Biomol.
Struct. 25, 113-136).

Such motifs and folds are themselves deposited in public databases which can
be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg;
PROSITE). This basic exercise leads to a structural homology map in which each of
the phage ORFs has been probed for such similarities, and where initial structural and
functional hits are identified (selected examples of sequence homologies detected
between individual ORFs from the genome of Staphylococcus aureus bacteriophage
77 and sequences deposited in public databases are shown in Table 5 for ORFs
17/19/43/102/104/182).
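
Motif scanning of a translated ORF against a PROSITE-style pattern can be sketched as
follows. This is a simplified illustration that handles only the basic pattern
elements (residue sets in square brackets, exclusions in braces, x for any residue,
and repeat counts in parentheses); it does not handle the anchoring symbols and is not
the software referred to above. The pattern used in the usage note is the widely
cited ATP/GTP-binding "P-loop" pattern, chosen purely as an example and not taken
from the phage 77 analysis.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern, e.g. '[AG]-x(4)-G-K-[ST]', to a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P".
        element = element.replace("{", "[^").replace("}", "]")
        # x means "any residue".
        element = element.replace("x", ".")
        # (n) or (n,m) is a repeat count.
        element = re.sub(
            r"\((\d+)(?:,(\d+))?\)",
            lambda m: "{" + m.group(1)
            + ("," + m.group(2) if m.group(2) else "") + "}",
            element)
        parts.append(element)
    return "".join(parts)


def scan(protein, pattern):
    """Return (1-based position, matched segment) for each motif occurrence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(protein.upper())]
```

For example, `scan("MSTGAPGSGKSTLLA", "[AG]-x(4)-G-K-[ST]")` reports one occurrence
beginning at residue 4 of that made-up sequence.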

This analysis can point out phage proteins with similarity to proteins from
other phages (such as those for E. coli) playing an important role in the basic
biochemical pathways of the phage (such as DNA replication, RNA transcription,
tRNAs, coat protein and assembly). Selected examples of such proteins include
integrase and capsid protein. Therefore, this analysis enables identification and
elimination of non-essential ORFs as candidates for an inhibitor function, as well as
the identification of (potentially) useful ones.

In addition, this analysis can point out specific ORFs as possible inhibitor
ORFs. For example, these ORFs may encode proteins or enzymes that alter bacterial
cell structure, metabolism or physiology, and ultimately viability. Examples of such
proteins present in the genome of Staphylococcus aureus bacteriophage 77 include
orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase).
(These ORF identifications are as listed in provisional application 60/110,992.) Other
examples include ORFs 9 and 12 of S. aureus phage 44 AHJD, which encode the
putative lysis functions found in many bacteriophages: a "holin" and an "amidase".

In addition, it is well known that bacterial and eukaryotic viruses can usurp
pathways from their host in order to use them to their advantage in blocking host
cellular pathways upon infection. The phage can achieve this by 1) directly producing
an inhibitor of a key host pathway (e.g. T7 gene 0.5 and 2), 2) directly producing a
novel activity (e.g. T4 DNA polymerase), and 3) altering concentrations of cell
components by producing similar functions (e.g. T4 transfer RNAs). The
identification of sequence similarity between phage ORFs and bacterial host genome
sequences will be highly indicative of such a mechanism. (Selected examples of such
homologies are listed in Figure 4 of the provisional application 60/110,992 and
include orf4 (homologous to autolysin), orf20 (hypothetical protein from
Staphylococcus aureus) and orf29 (hypothetical protein from Staphylococcus aureus).)
These ORFs can be analyzed by a standard biochemical approach to directly test their
inhibitor functions (e.g., as described below).

Alternatively, a homology search may reveal that a given phage ORF is related
to a protein present in the databases having an activity known to be inhibitory (e.g.,
inhibitor of host RNA polymerase by E. coli bacteriophage T7). Such a finding would
implicate the phage ORF product in a related activity. This will also suggest that a
new antimicrobial could be derived by a mimetic approach (e.g., peptidomimetic)
imitating this function or by a small molecule inhibitor to the bacterial target of the
phage ORF, or any steps in the relevant host metabolic pathway, e.g., high throughput
screening of small molecule libraries. Selected examples of such similarity between
ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions
for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992.
These include orf9 (similar to bacteriophage P1 kilA function), and orf4 (autolysin of
Staphylococcus aureus, amidase enzymatic activity).

A reason for the biochemical study of individual ORFs for inhibitor function is
that their expression or overexpression will block cellular pathways of the host,
ultimately leading to arrest and/or inhibition of host metabolism. In addition, such
ORFs can alter host metabolism in different ways, including modification of
pathogenicity. Therefore, individual ORFs identified above are expressed, preferably
overexpressed, in the host and the effect of this expression or overexpression on host
metabolism and viability is measured. This approach can be systematically applied to
every ORF of the phage, if necessary, and does not rely on the absolute identification
of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the
phage genomic DNA, e.g., by the polymerase chain reaction (PCR), preferably using
oligonucleotide primers flanking the ORF on either side. These single ORFs are
preferably engineered so that they contain appropriate cloning sites at their extremities
to allow their introduction into a new bacterial expression plasmid, allowing
propagation in a standard bacterial host such as E. coli, but containing the necessary
information for plasmid replication in the target microbe such as S. aureus (hereafter
referred to as shuttle vector). Shuttle vectors and their use are well known in the art.
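
The design of PCR primers that add cloning sites to the extremities of an ORF can be
sketched as follows. This is an illustration under stated assumptions only: the
primers here anneal to the first and last bases of the ORF itself, the BamHI
(GGATCC) and HindIII (AAGCTT) sites, the two-base clamp, and the annealing length are
hypothetical choices, and the sites actually used would depend on the shuttle vector.

```python
def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(genome, start, end, anneal_len=20,
                   site_5="GGATCC", site_3="AAGCTT", clamp="CG"):
    """Return (forward, reverse) primers for an ORF on the forward strand.

    start and end are 1-based inclusive coordinates. Each primer is written
    5' to 3' as clamp + restriction site + annealing region.
    """
    orf = genome[start - 1:end].upper()
    forward = clamp + site_5 + orf[:anneal_len]
    reverse = clamp + site_3 + reverse_complement(orf[-anneal_len:])
    return forward, reverse


def has_internal_site(genome, start, end, sites=("GGATCC", "AAGCTT")):
    """True if a chosen cloning site also occurs inside the ORF itself."""
    orf = genome[start - 1:end].upper()
    return any(site in orf for site in sites)
```

An ORF for which `has_internal_site` returns True would need different cloning sites,
since digestion of the PCR product would otherwise cut within the coding region.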

Such shuttle vectors preferably also contain regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode an
inhibitor function that will eliminate the host, it is beneficial that it not be expressed
prior to testing for activity. Thus, screening for such sequences when expressed in a
constitutive fashion is less likely to be successful when the inhibitor is lethal. In the
exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory
sequences from the ars operon of S. aureus are used to direct individual ORF
expression in S. aureus (or other bacteria in which the ars system is functional). The
ars operon encodes a series of proteins which normally mediate the extrusion of
arsenite and other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present. (Tauriainen, S. et al. (1997) App. Env. Microb., Vol. 63, No. 11, p. 4456-
4461.)

Therefore, individual phage ORFs can be expressed in S. aureus in an
inducible fashion by adding to the culture medium non-toxic arsenite concentrations
during the growth of individual S. aureus clones expressing such individual phage
ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or
arrest of growth under induction conditions, as measured by optical density in liquid
culture or after plating the induced cultures on solid medium. Subsequently,
interference of the phage ORF with the host biochemical pathways ultimately leading
to reduced or arrested host metabolism can be measured by pulse-chase experiments
using radiolabeled precursors of either DNA replication, RNA transcription, or protein
synthesis. Similar constructs can be made and used for other bacteria using
well-known techniques.
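
The optical density comparison can be expressed as a simple calculation. The sketch
below is a hypothetical scoring scheme, not a protocol taken from this description:
it compares induced and uninduced cultures of a clone, subtracts the effect seen in an
assumed empty-vector control (to account for any effect of the inducer itself), and
applies an arbitrary 50% threshold.

```python
def percent_inhibition(od_uninduced, od_induced):
    """Percent reduction in optical density caused by induction."""
    if od_uninduced <= 0:
        raise ValueError("uninduced optical density must be positive")
    return 100.0 * (1.0 - od_induced / od_uninduced)


def is_inhibitory(control_pair, orf_pair, threshold=50.0):
    """Flag an ORF clone as inhibitory.

    Each pair is (OD uninduced, OD induced). The control pair comes from an
    assumed empty-vector clone grown under the same conditions.
    """
    control = percent_inhibition(*control_pair)
    orf = percent_inhibition(*orf_pair)
    return (orf - control) >= threshold
```

In practice, readings would be taken at several time points and in replicate, and the
threshold would be chosen from the variation observed among control cultures.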

Those skilled in the art are familiar with a variety of other inducible systems
which can also be used for the controlled expression of phage ORFs, including, for
example, lactose (see, e.g., Stratagene's LacSwitch™ II system; La Jolla, CA) and
tetracycline-based systems (see, e.g., Clontech's Tet On/Tet Off™ system; Palo Alto,
CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7.

The selection or construction of shuttle vectors and the selection and use of
inducible systems are well known and thus other shuttle vectors appropriate for other
bacteria can be readily provided by those skilled in the art, e.g., for use in other
bacterial species.

Standard methodologies for expressing proteins from constructs, and isolating
and manipulating those proteins, for example in cross-linking and affinity
chromatography studies, may be found in various commonly available and known
laboratory manuals. See, e.g., Current Protocols in Protein Science. John Wiley &
Sons, Secaucus, N.J., and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

It has been found that certain phage or other viruses inhibit host cells, at least
in part, by producing an antisense RNA which binds to and inhibits translation from a
bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts
encoded by the phage genome, a strong indicator of a possible inhibitory function is
provided by the identification of a phage sequence which is identical to or fully
complementary to a bacterial sequence (or with only a small percentage of mismatch,
e.g., <10%, preferably less than 5%, most preferably less than 3%). This approach is
convenient in the case of bacteria that have been essentially completely sequenced, as
the comparison can be performed by computer using public database information.
The inhibitory effect of the transcript can be confirmed using expression of the
phage sequence in a host bacterium. If needed, such inhibition can also be tested by
transfecting the cells with a vector that will transcribe the phage sequence to form
RNA in such manner that the RNA produced will not be translated into a polypeptide.
Inhibition under such conditions provides a strong indication that the inhibition is due
to the transcript rather than to an encoded polypeptide.
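
The complementarity criterion described above (less than 10%, 5% or 3% mismatch) can
be sketched as an ungapped scan of a bacterial sequence with the reverse complement of
a phage sequence. This is a deliberately simple illustration: it ignores insertions
and deletions, runs in time proportional to the product of the two lengths, and would
be replaced by a database search program for genome-scale comparisons. The function
names are hypothetical.

```python
def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def antisense_hits(phage_seq, bacterial_seq, max_mismatch=0.10):
    """Return (1-based position, mismatch fraction) for each window of the
    bacterial sequence that is complementary to the phage sequence with a
    mismatch fraction strictly below max_mismatch."""
    probe = reverse_complement(phage_seq)
    target = bacterial_seq.upper()
    hits = []
    for i in range(len(target) - len(probe) + 1):
        fraction = mismatch_fraction(probe, target[i:i + len(probe)])
        if fraction < max_mismatch:
            hits.append((i + 1, fraction))
    return hits
```

Setting `max_mismatch` to 0.05 or 0.03 applies the more stringent preferred
thresholds. Identity rather than complementarity can be tested by comparing the phage
sequence directly with `mismatch_fraction`.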

In an alternative, the expression of an ORF in a host bacterium is found to be
inhibitory, but the inhibition is found to be due to an RNA product of the genomic
coding region. For antisense inhibition, the sequence of the bacterial target nucleic
acid sequence can be identified by inspection of the phage sequence, and the full
sequence of the relevant coding region for the bacterial product can be found from a
database of the bacterial genomic sequence or can be isolated by standard techniques
(e.g., a clone in a genomic library can be isolated which contains the full bacterial
ORF, and then sequenced).

In either case, the identification of a target which is inhibited by an RNA
transcript produced by a phage provides both the possible inhibition of bacteria
naturally containing the same target nucleic acid sequence, as well as the ability to use
the target sequence in screening for other types of compounds which will act directly
on the target nucleic acid sequence or on a polypeptide product expressed or
regulated, at least in part, by the target of the inhibitory phage RNA.

In some cases it will be found that the target of an inhibitory phage RNA or
protein has previously been found to be a target for an antibacterial agent. In such
cases, the phage inhibitor can still provide useful information if it is found that the
phage-encoded product acts at a different site than the previously identified
antibacterial agent or inhibitor, i.e., acts at a phage-specific site. For many targets,
action at a different site provides highly beneficial characteristics and/or information.
For example, an alternate site of inhibitor action can at least partially overcome a
resistance mechanism in a bacterium. As an illustration, in many cases, resistance is
due, in large part, to altered binding characteristics of the immediate target to the
antibacterial agent. The altered binding is due to a structural change which prevents
or destabilizes the binding. However, the structural change is frequently quite local,
so that compounds which bind at different local sites will be unaffected or affected to a
much lesser degree. Indeed, in some cases the local sites will be on a different
molecule and so may be completely unaffected by the local structural change creating
resistance to the original agent(s). An example of resistance due to altered binding is
provided by methicillin-resistant Staphylococcus aureus, in which the resistance is
due to an altered penicillin-binding protein.

In other cases, a new site of action can have improved accessibility as
compared to a site acted on by a previously identified agent. This can, for example,
assist in allowing effective treatment at lower doses, or in allowing access by a larger
range of types of compounds, potentially allowing identification of more potential
active agents.

Another advantage is that the structural characteristics of a different site of
action will lead to identification and/or development of inhibitors with different
structures and different pharmacological parameters. This can allow a greater range of
possibilities when selecting an antibacterial agent.

Yet further, different sites often produce different inhibitory characteristics in
the target organism. This is commonly the case for multi-domain target proteins.
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g.,
faster killing, slower development of resistance, lower numbers of surviving cells, and
different secondary effects (for example, different nutrient utilization).

Staphylococcus aureus phage 77 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts
to identify bacterial targets for potential new antibacterial agents.

As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found
to have bacteria-inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and
182 and products from the phage which inhibit the host bacterium both provides an
inhibitor compound and allows identification of the bacterial target affected by the
phage-encoded inhibitor. Such a target is thus identified as a potential target for
development of other antibacterial agents or inhibitors and the use of those targets to
inhibit those bacteria. As indicated above, even if such a target is not initially
identified in a particular bacterium, such a target can still be identified if a
homologous target is identified in another bacterium. Usually, but not necessarily,
such another bacterium would be a genetically closely related bacterium. Indeed, in
some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can
also inhibit such a homologous bacterial cellular component.

Possible bacterial target sequences are described herein by reference to sequence
source sites. In preferred embodiments, the sequence encoding the target corresponds
to an S. aureus nucleic acid sequence available from numerous sources including S.
aureus sequences deposited in GenBank, S. aureus sequences found in European
Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7,
1997, S. aureus sequences available from TIGR at
http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the
Oklahoma University S. aureus sequencing project at the following URL:
http://www.genome.ou.edu/staph new.html . Such possible targets are particularly
applicable to S. aureus phages 77, 3A, 96, and 44 AHJD.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein. Also, in preferred embodiments, a target sequence
corresponds to a S. aureus coding sequence corresponding to a sequence listed in
Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed
with GenBank. Again, for the sake of brevity, the sequences are described by
reference to the database accession numbers instead of being written out in full herein.
In cases where an entry for a coding region is not complete, the complete sequence
can be readily obtained by routine methods, e.g., by isolating a clone in a phage host
S. aureus genomic library, and sequencing the clone insert to provide the relevant
coding region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.
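
The translation step mentioned above can be illustrated with a minimal sketch. It
assumes the standard codon assignments used for bacterial genes and treats GTG or TTG
in the first position as an initiator methionine; the function and variable names are
illustrative only and are not taken from the present specification.

```python
# Illustrative sketch: translate a bacterial coding region into an amino acid sequence.
# Codons are enumerated in the conventional T, C, A, G order of the standard genetic code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# Assumption: alternative bacterial start codons are read as methionine at position 0.
ALTERNATIVE_STARTS = {"GTG", "TTG"}


def translate_coding_region(dna):
    """Translate a coding region (5' to 3', in frame) up to its first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if pos == 0 and codon in ALTERNATIVE_STARTS:
            residue = "M"
        else:
            # Codons containing ambiguous bases are reported as "X".
            residue = CODON_TABLE.get(codon, "X")
        if residue == "*":
            break  # stop codon marks the end of the coding region
        protein.append(residue)
    return "".join(protein)
```

For a sequence retrieved by accession number, the coding region would first be
trimmed to the annotated start and stop positions before being passed to such a
function.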

Staphylococcus aureus phage 44AHJD

The present invention also can utilize the identification of naturally occurring
DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which
encode proteins with antimicrobial activity.

Such identification can utilize bioinformatic identification of specific proteins
(ORFs) utilized by Staphylococcus aureus bacteriophage 44AHJD during the viral life
cycle that result in a slowing or arrest of growth of the Staphylococcus aureus host,
or in its death, including lysis of the infected bacteria. Thus, some of
the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are
predicted to encode antimicrobial functions. Information derived from these DNA
sequences and translated ORFs can, in turn, be utilized to develop, by peptidomimetic
design, inhibitory compounds that can also function as antimicrobials. In addition,
the host bacterial proteins that are targeted and inhibited by the
antimicrobial bacteriophage ORFs can themselves provide novel targets for drug
discovery.

The methodology described above is used to identify and characterize DNA
sequences from Staphylococcus sp. bacteriophage 44AHJD that have antimicrobial
activity. As described in the Examples, the Staphylococcus aureus propagating strain
(PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was
used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle
Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD
consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino
acids (Tables 17 & 18). Computational analysis of the predicted protein products of
Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence
databases as listed in Tables 19 and 20, along with the accompanying list of related
proteins.
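
As an illustration of how ORFs above a length cutoff might be enumerated from a phage
genome sequence, the following sketch scans all six reading frames. The criteria used
here (ATG as the only start codon, the first in-frame stop as the end, and a cutoff of
more than 33 codons) are assumptions made for the example; the exact ORF-calling rules
behind Tables 17 and 18 are not restated in this section.

```python
# Illustrative six-frame ORF scan; names and criteria are assumptions for this sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_orfs(genome, min_codons=34, start_codons=("ATG",)):
    """Return (strand, start, end, n_codons) for each ORF of at least min_codons.

    Coordinates are 0-based and half-open, and for the "-" strand they refer to
    positions on the reverse-complemented sequence. n_codons excludes the stop codon.
    """
    orfs = []
    for strand, seq in (("+", genome.upper()), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon in start_codons:
                    start = pos  # first start codon after the previous stop
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, pos + 3, n_codons))
                    start = None
    return orfs
```

With the default cutoff of 34 codons, only ORFs encoding more than 33 amino acids are
reported, matching the size threshold stated above.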

From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to
structural proteins found in other bacteriophages. These include genes predicted to
encode a tail protein (ORF 3), an upper collar/connector protein of the phage virion
(ORF 7), and a lower collar protein (ORF 8). Bioinformatics has also identified one
gene whose product is likely involved in phage DNA synthesis. This gene (ORF 1)
shows significant homology to DNA polymerases of a number of bacteriophages,
bacteria and fungi, and the product of this gene is likely responsible for replicating
the genetic material of bacteriophage 44AHJD. ORF 2 encodes a protein with
homology to the dinC gene of Bacillus subtilis that encodes a protein involved in
teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found in some,
but not all, Gram positive organisms (and not in Gram negative organisms), where it
is attached to the peptidoglycan layer. The phage protein may thus be involved in the
synthesis of this material for incorporation into the cell wall, allowing enhanced lysis
by the phage lysis enzymes or, as many enzymes can function in "reverse reactions",
may be involved in its degradation, allowing for penetration of the peptidoglycan and
phage genome entry into the cell following adsorption. The similarity between
Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that
they may share similar mechanisms of replication and growth. Both phages belong to
the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus
of this Family (Ackermann and DuBow; VIth ICTV Report).

Two genes, ORF 9 and 12, were identified with the potential to encode
antimicrobial protein products. The homology alignments are shown in Tables 19 and
20. The predicted product of ORF 9 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide
cell wall structure of a variety of micro-organisms, including that from the
Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus
bacteriophage 44AHJD shows homology to a set of lysis proteins from several
bacteriophages. These lysis proteins are also referred to as holins, and represent
phage-encoded lysis functions required for transit of the phage murein hydrolases
(lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the
bacterium.

Thus, in particular embodiments, the present invention provides a nucleic acid
sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at
least a portion of one of the genes described above with antimicrobial activity. For
example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize
host-derived accessory proteins for its activity when replicating the phage template,
sequestering such proteins from use by the bacterial polymerase, resulting in
inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9
directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to
encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12
likely encodes a holin function required for transit of the phage amidase (gene 9
product) to the periplasm. When this type of gene product from Bacillus phage phi 29
(gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Staphylococcus
bacteriophage 44AHJD antimicrobial ORFs or ORF products as pharmacological
agents, either wholly or in part, and derivatives thereof, as well as the use of
corresponding peptidomimetics, developed from amino acid or nucleotide sequence
knowledge derived from Staphylococcus bacteriophage 44AHJD killer ORFs.

Enterococcus phage 182

Bacteriophage 182 was obtained from the Felix d'Herelle phage collection
(Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of
Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to
encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational
analysis of the predicted protein products of Enterococcus bacteriophage 182 was
performed in order to identify protein products related to those deposited in public
databases. Bacteriophage 182 protein products which detected sequences with
significant sequence similarity in public databases are listed in Tables 24 and 26, along
with the accompanying list of related proteins.

From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and
011) are related to structural proteins of several Bacillus phages - Bacillus
bacteriophage PZA, phi-29, and B103. These include genes predicted to encode a tail
protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a
lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two
genes, ORF 005 and 019, are predicted to encode products which direct phage
morphogenesis.

Bioinformatics has also identified three genes whose products are likely
involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to
DNA polymerases of a number of bacteriophages, and the product of this gene is
likely responsible for replicating the genetic material of bacteriophage 182. ORF 006
encodes a protein with homology to the encapsidation proteins of several other
bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103
(X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the
in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins
involved in genome packaging have been shown to have additional activities that
affect biochemical reactions in other phages and their hosts. For example, the coat
protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally
repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction
also plays a role in genome encapsidation, enveloping a single copy of the viral
genome in a protein shell composed of many molecules of coat protein. In addition,
the bacteriophage lambda terminase enzyme can be lethal to E. coli when expressed,
suggesting cleavage of packaging sites in the bacterial chromosome. Also present
within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to
the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987)
and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends
of both strands of the genome and are essential for DNA replication, playing a role in
initial priming of DNA replication. The similarity between Enterococcus
bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they
may share similar mechanisms of replication and growth. Protein-primed DNA
replication is a well described phenomenon, and in the phi-29-like phages, the ends of
the DNA serve as origins and termini of replication (Gutierrez et al., 1986; Yoshikawa
et al., 1985).

There is also a gene (ORF 015) that encodes a protein showing homology to 
an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic 
acid binding protein of bacteriophage B103. 

Two genes, ORF 008 and 014, were identified with the potential to encode
anti-microbial protein products. The homology alignments are shown in Tables 24 &
26 and biochemical features of the predicted polypeptides are shown in Table 25. The
predicted product of ORF 008 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide
cell wall structure of a variety of micro-organisms. ORF 014 of Enterococcus phage
182 shows homology to a set of lysis proteins from Bacillus bacteriophage phi-29,
PZA, and B103. These lysis proteins are also referred to as holins and represent
phage-encoded lysis functions required for transit of the phage murein hydrolases
(lysozyme) to the periplasm, where they can digest the outer cell wall and thus lyse
the bacterium.

Thus, the present invention provides a nucleic acid sequence obtained from
Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 002 encodes a
DNA polymerase function. This polymerase may utilize host-derived accessory
proteins for its activity when replicating the phage template, sequestering such
proteins from use by the bacterial polymerase, resulting in inhibition of DNA
replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly
encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an
autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al.,
1998). ORF 014 likely encodes a holin function required for transit of the phage
murein hydrolases to the periplasm. When the related product from Bacillus phage phi
29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Enterococcus bacteriophage
182 anti-microbial ORFs as pharmacological agents, either wholly or in part, and
derivatives thereof, as well as the use of corresponding peptidomimetics, developed
from amino acid or nucleotide sequence knowledge derived from Enterococcus
bacteriophage 182 killer ORFs. This can be done where the structure of the
peptidomimetic compound corresponds to the structure of the active portion of a
product of an ORF. In this analysis, the peptide backbone is transformed into a
carbon-based hydrophobic structure that can retain cytostatic or cytocidal activity for
the bacterium. This is done by standard medicinal chemistry methods, measuring growth
inhibition of the various molecules in liquid cultures or on solid medium. These
mimetics also represent lead compounds for the development of novel antibiotics. In
this context, "corresponds" means that the peptidomimetic compound structure has
sufficient similarities to the structure of the active portion of a product of one of the
Enterococcus ORFs listed, that the peptidomimetic will interact with the same
molecule as the product of the ORF, and preferably will elicit at least one cellular
response in common which relates to the inhibition of the cell by the phage protein.

To validate the identity of an ORF as a killer ORF, it is preferably expressed
in the host or other test bacterial organism and the effect of this expression on
bacterial growth and replication is assessed. Therefore, all individual ORFs identified
herein, e.g., those identified above, can be expressed, preferably overexpressed, in a
suitable host bacterium, e.g., a host Enterococcus, and the effect of this expression or
overexpression on host metabolism and viability can be measured.

Individual ORFs can be resynthesized from the phage genomic DNA by the
polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on
either side. Those skilled in the art are familiar with the design and synthesis of
appropriate primer sequences. These single ORFs are preferably engineered so that
they contain appropriate cloning sites at their extremities to allow their introduction
into a new bacterial expression plasmid, allowing propagation in a standard bacterial
host such as E. coli, but containing the necessary information for plasmid replication
in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector).
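
By way of illustration only, primers carrying cloning sites at their 5' ends could be
derived from an ORF sequence as sketched below. The choice of BamHI (GGATCC) and EcoRI
(GAATTC) recognition sites, the 20-nucleotide annealing length, and all names are
assumptions made for the example and are not prescribed by the present description; a
practical design would also add a few extra 5' bases to aid digestion and would check
melting temperature and the absence of internal sites.

```python
# Illustrative primer sketch; restriction sites and lengths are assumed, not prescribed.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_cloning_primers(orf, anneal_len=20, fwd_site="GGATCC", rev_site="GAATTC"):
    """Return (forward, reverse) primers, each written 5' to 3'.

    The forward primer is the cloning site followed by the first anneal_len bases
    of the ORF. The reverse primer is the cloning site followed by the reverse
    complement of the last anneal_len bases of the ORF.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    forward = fwd_site + orf[:anneal_len]
    reverse = rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```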

This shuttle vector also preferably contains regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode a
killer function that will eliminate the host, it is highly advantageous that it not be
expressed (or at least not expressed at a substantial level) prior to testing for activity;
thus screening for such sequences in a constitutive fashion is less likely to be
successful because of lethality. In an example presented in Fig. 7, regulatory
sequences from the ars operon are used to direct individual ORF expression in
Enterococcus. The ars operon encodes a series of proteins which normally mediate the
extrusion of arsenite and several other trivalent oxyanions from the cells when they
are exposed to such toxic substances in their environment. The operon encoding this
detoxifying mechanism is normally silent and only induced when arsenite-related
compounds are present.

Therefore, individual phage ORFs can be expressed in Enterococcus or other
suitable host in an inducible fashion by adding to the culture medium non-toxic
arsenite concentrations during the growth of individual Enterococcus (or other host
cell) clones expressing such individual phage ORFs. Toxicity of the phage killer
ORF for the host is monitored by reduction or arrest of growth under induction
conditions, as measured by optical density in liquid culture or after plating the
induced cultures on solid medium. Subsequently, interference of the phage ORF with
the host biochemical pathways ultimately leading to reducing or arresting host
metabolism can be measured by pulse chase experiments using radiolabeled
precursors of either DNA replication, RNA transcription, or protein synthesis.
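
One simple way to express the reduction of growth under induction is as a percent
inhibition relative to an uninduced control culture, computed from paired optical
density readings. The sketch below is a minimal illustration under that assumption;
the function name, the optional blank correction, and any readings supplied to it are
hypothetical and are not data from the present specification.

```python
# Illustrative calculation; names and the blank-correction convention are assumptions.
def percent_inhibition(od_control, od_induced, blank=0.0):
    """Return percent growth inhibition at each time point.

    od_control and od_induced are optical density readings taken at the same time
    points for the uninduced and induced cultures. A value of 0 means no inhibition
    and 100 means no growth above the blank. Time points where the control shows no
    growth above the blank yield None.
    """
    results = []
    for control, induced in zip(od_control, od_induced):
        control_net = control - blank
        induced_net = induced - blank
        if control_net <= 0:
            results.append(None)  # inhibition is undefined without control growth
            continue
        results.append(100.0 * (1.0 - induced_net / control_net))
    return results
```

A clone whose inhibition rises toward 100 after addition of the inducer would be a
candidate for the follow-up pulse chase experiments described above.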

Of course, other inducible regulatory sequences (e.g., promoters, operators,
etc.) may be used (e.g., systems using positive induction of expression or systems
using release of repression). A variety of such systems are known to those skilled in
the art and can be utilized in the present invention.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by well-known
methods. Having the phage 182 ORFs, e.g., anti-bacterial ORFs of the present
invention, portions thereof, or oligonucleotides derived therefrom as described, other
anti-microbial sequences from other bacteriophage sources can be identified and
isolated using methods described here or other methods, including methods utilizing
nucleic acid hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage anti-microbial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences which are
highly homologous. The bacteriophage anti-microbial DNA segment from
bacteriophage 182 can be used to identify a related segment from another unrelated
phage based on stringent conditions of hybridization or on being a homolog based on
nucleic acid and/or amino acid sequence comparisons. As with the phage 182
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.

Enterococcus sequences are listed in Table 27 by accession number, providing
identification of possible targets of Enterococcus phage inhibitory ORF products, e.g.,
from phage 182.

Streptococcus pneumoniae 

As indicated in the Summary above, the present invention is concerned
with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the
encoded polypeptides or RNA transcripts to identify bacterial targets for potential new
antibacterial agents.

Streptococcus pneumoniae is an important cause of community-acquired
pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and
adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae are
relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al.,
1990). These strains also have decreased susceptibility to broad-spectrum
cephalosporins, which are frequently used in the empiric treatment of meningitis and
other serious invasive bacterial infections. High-level resistance of pneumococci has
been encountered in Hungary, where 70% of children who were colonized with S.
pneumoniae carried penicillin resistant strains that were also resistant to tetracycline,
erythromycin, and trimethoprim/sulfamethoxazole, with 30% resistant to
chloramphenicol (Neu, 1992). The resistance of pneumococci to macrolides such as
erythromycin averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992).

The antimicrobial susceptibilities and distribution of serotypes of the 42
isolates of S. pneumoniae in southern Taiwan from invasive infections have been
recently determined (Hseuh et al., 1996). Resistance rates among these isolates were:
erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline,
73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the
isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the
infections and mortality was 42.6%. Given the severity of these infections despite
adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic
options to prevent mortality due to invasive S. pneumoniae infections.

Pneumococcal phages belong to four families and they present a great variety
in morphology, including lytic and temperate phages (for a review, see Garcia et al.,
1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate
phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and
functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a
19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to
its 5' ends, that replicates by a protein-primed mechanism. The phage contains 29
ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were
compared to sequences compiled in GenBank/EMBL databases, 10 ORFs showed
significant similarity to proteins of bacteriophage phi 29 that infects B. subtilis
(Martin et al., 1996). The similar proteins corresponded to those involved in DNA
replication (terminal protein and DNA polymerase), structural and morphogenic
proteins (major head, collar, connector, tail, and encapsidation proteins), and proteins
involved in lysis function (holin and lysozyme). In its strategy of lysis, the holin gene
product inserts itself into the cell membrane, allowing access of the lysozyme to the
peptidoglycan. Expression of the Cp-1 holin protein in E. coli results in cell death
after 2 hours of induction, but does not lead to lysis (Garcia et al., 1997). Cells
harboring a plasmid construction with holin and lysozyme genes together did lyse after
induction and the viability loss was similar to that of the culture expressing holin
alone. Cloning of these lytic genes in S. pneumoniae showed that both genes had the
same effect as in E. coli. That is, holin itself did not lyse the culture but the
viability loss was noticeable, whereas both holin and lysozyme together were capable
of lysing M31, an amidase deleted mutant (Garcia et al., 1997).

Recently, a small portion (~4 kbp) of a second S. pneumoniae phage, Dp-1,
has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for
the lytic system (Sheehan et al., 1997) and shows a modular organization similar to
that described for Cp-1. However, in this case, a single chimeric protein appears to be
made in which the N-terminal domain is highly similar to that of the murein hydrolase
coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the
C-terminal domain is homologous to holins. Thus, both functions appear to have been
combined in a novel chimeric protein.

Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de
Microbiologia Molecular, Centro de Departamento de Investigaciones Biologicas,
Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We
found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to
encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno
motifs for translation initiation (Tables 28 & 30, and Fig. 6). Predicted protein
products of Streptococcus bacteriophage Dp-1 for which computational analysis
detected homologs in public databases are listed in Table 31, along
with the accompanying list of related proteins.
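
The requirement for an upstream Shine-Dalgarno motif can be illustrated by a simple
filter applied to candidate start codons. The sketch below assumes that a motif is any
substring of at least four bases of the consensus AGGAGG found within 20 nucleotides
upstream of the start codon; these parameters and names are chosen for illustration
and are not the criteria used to prepare Tables 28 and 30.

```python
# Illustrative Shine-Dalgarno filter; the window and core length are assumptions.
SD_CONSENSUS = "AGGAGG"


def sd_cores(min_len=4):
    """Return all substrings of the consensus that are at least min_len long."""
    return {
        SD_CONSENSUS[i:i + n]
        for n in range(min_len, len(SD_CONSENSUS) + 1)
        for i in range(len(SD_CONSENSUS) - n + 1)
    }


def has_shine_dalgarno(genome, start, window=20, min_len=4):
    """Report whether a Shine-Dalgarno-like core lies just upstream of a start codon.

    start is the 0-based index of the first base of the start codon on the strand
    represented by genome.
    """
    upstream = genome[max(0, start - window):start].upper()
    return any(core in upstream for core in sd_cores(min_len))
```

Such a filter would be applied to each candidate ORF start position, retaining only
ORFs that exceed the size cutoff and have a plausible ribosome binding site.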

From this analysis, it is apparent that several predicted genes of Dp-1 encode
polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are
predicted to encode tail proteins, minor structural proteins, and minor capsid proteins
(Table 31). We also note the identification of several gene products that are likely
involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase;
ORF 8, which encodes a SWI/SNF helicase-related protein; ORF 10, which encodes a
protein showing homology to recA; and ORF 13, which is a dnaZX-like ORF.

In E. coli, RapA is an RNA polymerase (RNAP)-associated protein with
ATPase activity and a homolog of the eukaryotic SWI/SNF family, a set of
proteins whose members are involved in transcription activation,
nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP,
as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves
similarly or in a dominant-negative fashion to inhibit the activity of RapA. Mutation
of the essential E. coli dnaZX results in a block in DNA chain elongation during
replication (Maki et al., 1988). The dnaZX gene has only one open reading frame for
a 71-kDa polypeptide from which the two distinct DNA polymerase III holoenzyme
subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the
precursor of the gamma subunit, and the gamma subunit is produced by a -1
frameshift causing early termination of translation (Tsuchihashi et al., 1990). These
proteins show single-strand DNA binding properties that are ATPase (and dATPase)
dependent and are thought to increase the processivity of the core DNA polymerase
enzyme (Lee et al., 1987).

There are several Dp-1 ORFs which encode proteins predicted to play a role in
cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ
synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently
bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of
Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose
sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S
regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon
may be involved in a contact-mediated translocation mechanism to transfer anti-host
factors directly into eukaryotic cells, disrupting eukaryotic signal transduction through
ADP-ribosylation (Frank, 1997).

There is also a protein with similarity to GTP cyclohydrolase I (ORF 21) and
ORF 41, which shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an
enzyme that catalyzes the first reaction in the pathway for the biosynthesis of
pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption
of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional
lethality due to folinic acid auxotrophy, which can be complemented with the
mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini
et al., 1999).

ORF 16 shows high homology to autolysin. This region of the phage sequence
was previously reported (Sheehan et al., 1997) and encompasses ~4 kbp of our
sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32.

Thus, the present invention provides a nucleic acid sequence obtained from
Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 013 encodes a
protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This
protein may act in a dominant-negative fashion to sequester the host DNA polymerase
for its own replication, thus inhibiting host DNA replication. The dnaX gene product
is essential for E. coli replication (Kodaira et al., 1983).

In certain preferred embodiments of the present invention, the bacterial target of
a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is
encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for
bacteriophage Dp-1. As above, possible target sequences are described herein by
reference to sequence source sites. The sequence encoding the target preferably
corresponds to a Streptococcus nucleic acid sequence available from The Institute for
Genomic Research (TIGR), or available from GenBank or other public database. The
TIGR Streptococcus sequences are publicly available at The Institute for Genomic
Research at URL: http://www.tigr.org

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein. Also, in preferred embodiments, a target sequence
corresponds to a Streptococcus pneumoniae coding sequence corresponding to a
sequence listed in Table 33 herein. Sequences for other Streptococcal species are also
available from TIGR and/or from GenBank. The listing in Table 33 describes
Streptococcus sequences currently deposited in GenBank. Again, for the sake of
brevity, the sequences are described by reference to the GenBank entries instead of
being written out in full herein. In cases where the TIGR or GenBank entry for a
coding region is not complete, the complete sequence can be readily obtained by
routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp.
genomic library, and sequencing the clone insert to provide the relevant coding
region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.

In the various aspects of this invention involving Dp-1 sequences, the
sequence is preferably not contained in the sequence described in Sheehan et al., 1997
(Table 32).

Validating Identified Inhibitory Phage ORFs 
A fifth step involves validating the identified phage inhibitor ORF by
independent methods, and delineating further possible smaller segments of the ORFs
that have inhibitory activity. Several methods exist to validate the role of the
identified ORF as an inhibitor ORF.

One example utilizes the creation of a mutant variant of the phage ORF in
which the candidate ORF carries a partial or complete loss-of-function mutation that
is measurable as compared with the non-mutant ORF. Comparison of the effects of
expression of the loss-of-function mutant with the normal ORF provides confirmation
of the identification of an inhibitor ORF where the loss-of-function mutant provides a
measurably lower level of inhibition, preferably no inhibition. The loss of function
may be conditional, e.g., temperature sensitive.

Once validation of the inhibitor ORF is achieved, a bi-directional deletion
analysis can be carried out using the same experimental system to identify the
minimal polypeptide segment that has inhibitor activity. This may be carried out by a
variety of means, e.g., by exonuclease or PCR methodologies, and is used to
determine if a relatively small segment of the ORF (i.e., the product of the ORF) still
possesses inhibitory activity when isolated away from its native sequence. If so, a
portion of the ORF encoding this "active portion" can be used as a template for the
synthesis of novel anti-microbial agents and further allows derivation of the peptide
sequence, e.g., using modified peptides and/or peptidomimetics.
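
The logic of a bi-directional deletion analysis can be sketched as follows. This is
only an illustration under stated assumptions: the `is_active` callback stands in for
the experimental inhibition assay, the motif-based toy assay is invented for the
example, and the greedy trimming assumes that activity is lost once any essential
residue is removed.

```python
from typing import Callable, Tuple


def minimal_active_segment(
    protein: str, is_active: Callable[[str], bool]
) -> Tuple[int, int]:
    """Trim residues from the N-terminus, then from the C-terminus, for as
    long as the remaining segment is still scored active. Returns the
    half-open interval (start, end) into `protein`."""
    if not is_active(protein):
        raise ValueError("full-length product is not active")
    start, end = 0, len(protein)
    # N-terminal deletions
    while start + 1 < end and is_active(protein[start + 1:end]):
        start += 1
    # C-terminal deletions
    while end - 1 > start and is_active(protein[start:end - 1]):
        end -= 1
    return start, end


def toy_assay(segment: str) -> bool:
    """Hypothetical stand-in for the assay: active if a made-up motif is present."""
    return "KLMNPQ" in segment


full_length = "AAAKLMNPQAAA"  # invented sequence
start, end = minimal_active_segment(full_length, toy_assay)
print(start, end, full_length[start:end])
```

Each call to `is_active` corresponds to one deletion construct that would have to
be built and tested, so a real analysis would typically use coarser deletion steps.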

In creation of certain peptidomimetics, the peptide backbone is transformed
into a carbon-based hydrophobic structure that can retain inhibitor activity against the
bacterium. This is done by standard medicinal chemistry methods, typically
monitored by measuring growth inhibition by the various molecules in liquid cultures
or on solid medium. These mimetics can also represent lead compounds for the
development of novel antibiotics.

Recently, a major effort has been undertaken by the pharmaceutical industry
and their biotechnology partners for the sequencing of bacterial pathogen genomes.
The rationale is that the systematic sequencing of the genome will identify all of the
bacterial proteins and therefore this proteome will be the target for designing novel
inhibitor antibiotics. Although systematic, this approach has several major problems.
The first is that analysis of primary amino acid sequences of bacterial proteins does
not immediately reveal which protein will be essential for viability of the bacterium,
and target validation is thus a major issue. The second problem is one of redundancy,
as several biochemical pathways are either structurally duplicated in bacteria
(different isoforms of the same enzyme), or functionally duplicated by the presence of
salvage pathways in the event of a metabolic block in one pathway (different
nutritional conditions). The third is that even a valid target may not be structurally or
functionally amenable to inhibition by small molecules because of inaccessibility
(sequestration of target).

Therefore, there is considerable interest within the pharmaceutical and 
biotechnology industry in identifying key targets for drug discovery amongst the mass 
of novel targets generated by large-scale genomic sequencing projects.

On the other hand, and underscoring the instant invention, the phages herein 
described have, over millions of years, evolved specific mechanisms to target such 
key biochemical pathways and proteins. In the few cases where inhibition by phages 
has been elucidated (e.g., see ref. 3), such bacterial targets are invariably rate-limiting 
in their respective biochemical pathways, are not redundant, and/or are readily
accessible for inhibition by the phage (or by another inhibitory compound). 
Therefore, the sixth step of this invention involves identifying the host biochemical 
pathways and proteins that are targeted by the phage inhibitory mechanisms. 

Identifying, Validating, and Characterizing Bacterial Host Target Proteins and
Affected Pathways 

A rationale for this step is that the inhibitor ORF product from the phage 
physically interacts with and/or modifies certain microbial host components to block 
their function. Exemplary approaches which can be used to identify the host bacterial
pathways and proteins that interact with, and preferably also are inhibited by, phage
ORF product(s) are described below. 

One approach is a genetic screen to determine physiological protein:protein
interaction, for example, using a yeast two hybrid system. In this assay, the phage
ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino
acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences,
engineered into a plasmid in which the S. aureus sequences are fused to
the DNA binding domain of Gal4, is also generated. These plasmids are introduced
alone, or in combination, into yeast strain Y190 - previously engineered with
chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes,
both under Gal4 regulation (Durfee, T., Becherer, K., Chen, P.-L., Yeh, S.-H., Yang,
Y., Kilburn, A.E., Lee, W.-H., and Elledge, S.J. (1993). Genes & Dev. 7, 555-569). If
the two proteins expressed in yeast interact, the resulting complex will activate
transcription from promoters containing Gal4 binding sites. A lacZ and His3 gene,
each driven by a promoter containing Gal4 binding sites, have been integrated into the
genome of the host yeast system used for measuring protein-protein interactions. Such
a system provides a physiological environment in which to detect potential protein
interactions. This system has been extensively used to identify novel protein-protein
interaction partners and to map the sites required for interaction (for example, to
identify interacting partners of translation factors (Qiu, H., Garcia-Barrio, M.T., and
Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711), transcription factors
(Katagiri, T., Saito, H., Shinohara, A., Ogawa, H., Kamada, N., Nakamura, Y., and
Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222), and proteins involved
in signal transduction (Endo, T.A., Masuhara, M., Yokouchi, M., Suzuki, R.,
Sakamoto, H., Mitsui, K., Matsumoto, A., Tanimura, S., Ohtsubo, M., Misawa, H.,
Miyazaki, T., Leonor N., Taniguchi, T., Fujita, T., Kanakura, Y., Komiya, S., and
Yoshimura, A. Nature. 387, 921-924)). This approach has also been used in many
published reports to identify interaction between mammalian viral and mammalian
cell proteins.

For example, the non-structural protein NS1 of parvovirus is essential for viral
DNA amplification and gene expression and is also the major cytopathic effector of
these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein
of unknown function that interacts with NS1, called SGT, for small glutamine-rich
tetratricopeptide repeat (TPR)-containing protein (Cziepluch C., Kordes E., Poirey R.,
Grewenig A., Rommelaere J., and Jauniaux J.C. (1998) J Virol 72, 4149-4156). In
another screen, the adenovirus E3 protein was recently shown to interact with a novel
tumor necrosis factor alpha-inducible protein and to modulate some of the activities of
E3 (Li Y., Kang J., and Horwitz M.S. (1998). Mol & Cell Biol 18, 1601-1610). In yet
another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was
found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi Y.,
Van Sant C., and Roizman B. (1997). J Virol 71, 7328-7336).

Another two-hybrid system for identifying protein:protein interactions is
commercially available from STRATAGENE™ as the CYTO-TRAP™ system
(Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The
system is a yeast-based method for detecting protein:protein interactions in vivo, using
activation of the Ras signal transduction cascade by localizing a signal pathway
component, human Sos (hSos), to its activation site in the yeast plasma membrane.
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain
cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25
gene. This gene encodes a guanyl nucleotide exchange factor which binds and
activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host
growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The
system utilizes the ability of hSos to complement the cdc25 defect and activate the
yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma
membrane, the cdc25H yeast strain grows at 37°C. Localizing hSos to the plasma
membrane occurs through a protein:protein interaction. A protein of interest, or bait,
is expressed as a fusion protein with hSos. The library, or target, proteins are
expressed with the myristylation membrane-localization signal. The yeast cells are
then incubated under restrictive conditions (37°C). If the bait and the target protein
interact, the hSos protein is recruited to the membrane, activating the Ras signaling
pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature.

The protein targets of phage inhibitory ORFs can also be identified using
bacterial genetic screens. One approach involves the overexpression of a phage
inhibitory protein in mutagenized bacterial host species, followed by plating the cells
and searching for colonies that can survive the antimicrobial activity of the inhibitory
ORF. These colonies are then grown, their DNA extracted, and cloned into an
expression vector that contains a replicon of a different incompatibility group from
the plasmid expressing the original ORF. This library is then introduced into a
wild-type host bacterium in conjunction with an expression vector driving synthesis of the
phage ORF, followed by selection for surviving bacteria. Thus, bacterial DNA
fragments from the survivors presumably contain a DNA fragment from the original
mutagenized host bacterial genome that can protect the cell from the antimicrobial
activity of the inhibitory phage ORF. This fragment can be sequenced and compared
with that of the bacterial host to determine in which gene the mutation lies. This
approach enables one to determine the targets and pathways that are affected by the
killing function.

A second approach is based on identifying protein:protein interactions
between the phage ORF product and bacterial proteins, e.g., S. aureus proteins, using a
biochemical approach based, for example, on affinity chromatography. This approach
has been used, for example, to identify interactions between lambda phage proteins
and proteins from their E. coli host (Sopta, M., Carthew, R.W., and Greenblatt, J.
(1985) J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag
(e.g., glutathione-S-transferase ("GST"), 6xHIS ("HIS"), and/or calmodulin binding
protein ("CBP")) within a commercially available plasmid vector that directs high
level expression on induction of a suitably responsive promoter driving the fusion's
expression. The translated fusion protein is expressed in E. coli, purified, and
immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from
the host bacterium, e.g., S. aureus, are then passed through the affinity matrix
containing the immobilized phage ORF fusion protein; host proteins retained on the
column are then eluted under different conditions of ionic strength, pH, detergents,
etc., and characterized by gel electrophoresis and other techniques. Appropriate
controls are run to guard against nonspecific binding to the resin. Target proteins thus
recovered should be enriched for those that bind the phage protein/peptide of interest
and are subsequently electrophoretically or otherwise separated, purified, sequenced, or
biochemically analyzed. Usually sequencing entails individual digestion of the
proteins to completion with a protease (e.g., trypsin), followed by molecular mass and
amino acid composition and sequence determination using, for example, mass
spectrometry, e.g., by MALDI-TOF technology (Qin, J., Fenyo, D., Zhao, Y., Hall,
W.W., Chao, D.M., Wilson, C.J., Young, R.A. and Chait, B.T. (1997). Anal. Chem.
69, 3995-4001).

The sequences of the individual peptides from a single protein are then
analyzed by the bioinformatics approach described above to identify the S. aureus
protein interacting with the phage ORF. This analysis is performed by a computer
search of the S. aureus genome for an identified sequence. Alternatively, all tryptic
peptide fragments of proteins encoded by the S. aureus genome can be predicted by
computer software, and the molecular mass of such fragments compared to the molecular
mass of the peptides obtained from each interacting protein eluted from the affinity
matrix. The responsible gene sequence can be obtained, for example, by using synthetic
degenerate nucleic acid sequences to pull out the corresponding homologous bacterial
sequence. Alternatively, antibodies can be generated against the peptide and used to
isolate nascent peptide/mRNA transcript complexes, from which the mRNA can be reverse
transcribed, cloned, and further characterized using the procedures discussed herein.
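
The mass-matching alternative described above can be sketched in a few lines of
Python. This is a simplified illustration, not the software used in practice: the
protein names and sequences are invented, the cleavage rule (after K or R unless
followed by P) is the common textbook rule for trypsin, missed cleavages and
modifications are ignored, and the 0.5 Da tolerance is an arbitrary assumption.

```python
import re

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein):
    """Cleave C-terminal to K or R, except when the next residue is P.
    Splitting on an empty match requires Python 3.7 or later."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed, proteins, tolerance=0.5):
    """Map each protein name to its predicted peptides whose mass lies
    within `tolerance` daltons of any observed mass."""
    hits = {}
    for name, seq in proteins.items():
        matched = [
            (pep, round(peptide_mass(pep), 4))
            for pep in tryptic_peptides(seq)
            if any(abs(peptide_mass(pep) - m) <= tolerance for m in observed)
        ]
        if matched:
            hits[name] = matched
    return hits


# Invented sequences standing in for translated genome entries.
candidates = {"hypothetical_A": "MKRPAKG", "hypothetical_B": "GGGK"}
print(match_masses([277.15], candidates))
```

A protein supported by several independent peptide masses is a stronger candidate
than one matched by a single mass, so real analyses score the number of matches.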

A variety of other binding assay methods are known in the art and can be used 
to identify interactions between phage proteins and bacterial proteins or other bacterial 
cell components. Such methods that allow or provide identification of the bacterial 
component can be used in this invention for identifying putative targets. 

Validation of the interaction between the phage ORF product and the bacterial
proteins or other components can be obtained by a second independent assay (e.g.,
co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, H.,
Garcia-Barrio, M.T., and Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711;
Brown, S. and Blumenthal, T. (1976). Proc. Natl. Acad. Sci. USA 73, 1131-1135)).

Finally, the essential nature of the identified bacterial proteins is preferably
determined genetically by creating a constitutive or inducible partial or complete
loss-of-function mutation in the gene encoding the identified interacting bacterial protein.
This mutant is then tested for bacterial survival and replication.

The protein target of the phage inhibitor function can also be identified using a
genetic approach. Two exemplary approaches will be delineated here. The first
approach involves the overexpression of a predetermined phage inhibitor protein in
mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching
for colonies that can survive the inhibitor. These colonies will then be grown, their
DNA extracted and cloned into an expression vector that contains a replicon of a
different incompatibility group, and preferably having a different selectable marker
than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the
mutant that can protect the cell from phage ORF inhibition can be sequenced and
compared with that of the bacterial host to determine in which gene the mutation lies.
This approach allows rapid determination of the targets and pathways that are affected
by the inhibitor.

Alternatively, the bacterial targets can be determined in the absence of
selecting for mutations using an approach known as "multicopy suppression". In this
approach, the DNA from the wild type host is cloned into an expression vector that
can coexist, as previously described, with one containing a predetermined phage
inhibitor. Those plasmids that contain host DNA fragments and genes that protect the
host from the phage inhibitor can then be isolated and sequenced to identify putative
targets and pathways in the host bacteria.

Regardless of the specific mode of identification, screening assays may
additionally utilize gene fusions to specific "reporter genes" to identify a bacterial
gene(s) whose expression is affected when the host target pathway is affected by the
phage inhibitor. Such gene fusions can be used to search a number of small molecule
compounds for inhibitors that may affect this pathway and thus cause cell inhibition.
This approach will allow the screening of a large number of molecules on petri dishes
or in 96-well format by monitoring for a simple color change in the bacterial colonies.
In this manner, we can validate host targets and classes of compounds for further
study and clinical development. These inhibitors also represent lead compounds for
the development of other antibiotics.
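
One way such a color-change screen might be scored in a 96-well format is sketched
below. Everything specific here is an assumption made for illustration: the
absorbance readings, the well names, the use of untreated wells as negative
controls, and the three-standard-deviation cutoff are not taken from the text.

```python
from statistics import mean, stdev


def call_hits(readings, negative_controls, threshold_sd=3.0):
    """Return the wells whose reporter signal differs from the mean of the
    negative-control wells by more than `threshold_sd` standard deviations."""
    control_values = [readings[well] for well in negative_controls]
    baseline = mean(control_values)
    spread = stdev(control_values)
    return sorted(
        well
        for well, value in readings.items()
        if well not in negative_controls
        and abs(value - baseline) > threshold_sd * spread
    )


# Invented absorbance values: A1-A3 are untreated controls, B1 and B2 hold compounds.
plate = {"A1": 0.10, "A2": 0.12, "A3": 0.11, "B1": 0.95, "B2": 0.11}
print(call_hits(plate, negative_controls=["A1", "A2", "A3"]))
```

Compounds flagged this way would still need confirmation, since a change in
reporter signal can also arise from effects unrelated to the target pathway.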

Bioinformatics and comparative genomics are preferably then applied to the 
identified bacterial gene products to predict biochemical function. The biochemical 
activity of the protein can be verified in vitro in cell free assays or in vivo in intact 
cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are
established as a basis for the screening and development of inhibitors.

These inhibitors, preferably small molecule inhibitors, may comprise peptides, 
antibodies, products from natural sources such as fungal or plant extracts or small 
molecule organic compounds. In general, small molecule organic compounds are 
preferred. These compounds may, for example, be identified within large compound
libraries, including combinatorial libraries. For example, a plurality of compounds,
preferably a large number of compounds, can be screened to determine whether any of
the compounds binds or otherwise disrupts or inhibits the identified bacterial target.
Compounds identified as having any of these activities can then be evaluated further
in cell culture and/or animal model systems to determine the pharmacological 
properties of the compound, including the specific anti-microbial ability of the 
compound. 

For mixtures of natural products, including crude preparations, once a
preparation or fraction of a preparation is shown to have an anti-microbial activity,
the active substance can be isolated and identified using techniques well known in the
art, if the compound is not already available in a purified form.

Identified compounds possessing anti-microbial activity and similar 
compounds having structural similarity can be further evaluated and, if necessary,
derivatized according to synthesis and/or modification methods available in the art 
selected as appropriate for the particular starting molecule. 

Derivatization of identified anti-microbials 

In cases where the identified anti-microbials above might represent peptidal
compounds, the in vivo effectiveness of such compounds may be advantageously
enhanced by chemical modification using the natural polypeptide as a starting point
and incorporating changes that provide advantages for use, for example, increased
stability to proteolytic degradation, reduced antigenicity, improved tissue penetration,
and/or improved delivery characteristics.

In addition to active modifications and derivative creations, it can also be
useful to provide inactive modifications or derivatives for use as negative controls or
for induction of immunologic tolerance. For example, a biologically inactive
derivative which has essentially the same epitopes as the corresponding natural
antimicrobial can be used to induce immunological tolerance in a patient being
treated. The induction of tolerance can then allow uninterrupted treatment with the
active anti-microbial to continue for a significantly longer period of time.

Modified anti-microbial polypeptides and derivatives can be produced using a
number of different types of modifications to the amino acid chain. Many such
methods are known to those skilled in the art. The changes can include, for example,
reduction of the size of the molecule, and/or the modification of the amino acid
sequence of the molecule. In addition, a variety of different chemical modifications of
the naturally occurring polypeptide can be used, either with or without modifications
to the amino acid sequence or size of the molecule. Such chemical modifications can,
for example, include the incorporation of modified or non-natural amino acids or
non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis
modification of incorporated chain moieties.

The oligopeptides of this invention can be synthesized chemically or through
an appropriate gene expression system. Synthetic peptides can include both naturally
occurring amino acids and laboratory synthesized, modified amino acids.

Also provided herein are functional derivatives of anti-microbial proteins or 
polypeptides. By "functional derivative" is meant a "chemical derivative,"
"fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which
terms are defined below. A functional derivative retains at least a portion of the 
function of the protein, for example reactivity with a specific antibody, enzymatic 
activity or binding activity. 

A "chemical derivative" of the complex contains additional chemical moieties
not normally a part of the protein or peptide. Such moieties may improve the
molecule's solubility, absorption, biological half-life, and the like. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, and the like. Moieties capable of mediating
such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling
such moieties to a molecule are well known in the art. Covalent modifications of the
protein or peptides are included within the scope of this invention. Such
modifications may be introduced into the molecule by reacting targeted amino acid
residues of the peptide with an organic derivatizing agent that is capable of reacting
with selected side chains or terminal residues, as described below.

Cysteinyl residues most commonly are reacted with alpha-haloacetates (and 
corresponding amines), such as chloroacetic acid or chloroacetamide, to give 
carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are 
derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate,
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide,
p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or
chloro-7-nitrobenzo-2-oxa-1,3-diazole.

Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH
5.5-7.0 because this agent is relatively specific for the histidyl side chain.
Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M
sodium cacodylate at pH 6.0.

Lysinyl and amino terminal residues are reacted with succinic or other 
carboxylic acid anhydrides. Derivatization with these agents has the effect of 
reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing 
primary amine-containing residues include imidoesters such as methyl
picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride;
trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and
transaminase-catalyzed reaction with glyoxylate.

Arginyl residues are modified by reaction with one or several conventional 
reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and 
ninhydrin. Derivatization of arginine residues requires that the reaction be performed
in alkaline conditions because of the high pKa of the guanidine functional group.
Furthermore, these reagents may react with the groups of lysine as well as the arginine 
alpha-amino group. 

Tyrosyl residues are well-known targets of modification for introduction of
spectral labels by reaction with aromatic diazonium compounds or tetranitromethane.
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl
tyrosyl species and 3-nitro derivatives, respectively.

Carboxyl side groups (aspartyl or glutamyl) are selectively modified by
reaction with carbodiimides (R'-N=C=N-R') such as 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl))
carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore,
aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues
by reaction with ammonium ions.

Glutaminyl and asparaginyl residues are frequently deamidated to the
corresponding glutamyl and aspartyl residues. Alternatively, these residues are
deamidated under mildly acidic conditions. Either form of these residues falls within
the scope of this invention.

Derivatization with bifunctional agents is useful, for example, for
cross-linking component peptides to each other or the complex to a water-insoluble support
matrix or to other macromolecular carriers. Commonly used cross-linking agents
include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde,
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid,
homobifunctional imidoesters, including disuccinimidyl esters such as
3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as
bis-N-maleimido-1,8-octane. Derivatizing agents such as
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates
that are capable of forming
crosslinks in the presence of light. Alternatively, reactive water-insoluble matrices
such as cyanogen bromide-activated carbohydrates and the reactive substrates
described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128; 4,247,642; 4,229,537;
and 4,330,440 are employed for protein immobilization.

Other modifications include hydroxylation of proline and lysine,
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the
alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E.,
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco,
pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances,
amidation of the C-terminal carboxyl groups.

Such derivatized moieties may improve the stability, solubility, absorption, 
biological half life, and the like. The moieties may alternatively eliminate or attenuate
any undesirable side effect of the protein complex. Moieties capable of mediating 
such effects are disclosed, for example, in Alfonso and Gennaro (1995). 

The term "fragment" is used to indicate a polypeptide derived from the amino 
acid sequence of the protein or polypeptide having a length less than the full-length
polypeptide from which it has been derived. Such a fragment may, for example, be
produced by proteolytic cleavage of the full-length protein. Preferably, the fragment 
is obtained recombinantly by appropriately modifying the DNA sequence encoding 
the proteins to delete one or more amino acids at one or more sites of the C-terminus, 
N-terminus, and/or within the native sequence. 

Another functional derivative intended to be within the scope of the present
invention is a "variant" polypeptide that either lacks one or more amino acids or
contains additional or substituted amino acids relative to the native polypeptide. The
variant may be derived from a naturally occurring polypeptide by appropriately
modifying the protein DNA coding sequence to add, remove, and/or to modify codons
for one or more amino acids at one or more sites of the C-terminus, N-terminus,
and/or within the native sequence.

A functional derivative of a protein or polypeptide with deleted, inserted 
and/or substituted amino acid residues may be prepared using standard techniques 
well-known to those of ordinary skill in the art. For example, the modified
components of the functional derivatives may be produced using site-directed
mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2:183; 
Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified 
such that a modified coding sequence is produced, and thereafter expressing this 
recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as
those described above. Alternatively, components of functional derivatives of
complexes with amino acid deletions, insertions and/or substitutions may be 
conveniently prepared by direct chemical synthesis, using methods well-known in the 
art. 

Insofar as other anti-microbial inhibitor compounds identified by the invention
described herein may not be peptidal in nature, other chemical techniques exist to
allow their suitable modification, as well, according to the desirable principles
discussed above.

Administration and Pharmaceutical Compositions

For the therapeutic and prophylactic treatment of infection, the preferred 
method of preparation or administration of anti-microbial compounds will generally
vary depending on the precise identity and nature of the anti-microbial being
delivered. Thus, those skilled in the art will understand that administration methods
known in the art will also be appropriate for the compounds of this invention. 

The particularly desired anti-microbial can be administered to a patient either 
by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or 
excipient(s). In treating an infection, a therapeutically effective amount of an agent or
agents is administered. A therapeutically effective dose refers to that amount of the 
compound that results in amelioration of one or more symptoms of bacterial infection 
and/or a prolongation of patient survival or patient comfort. 

Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be
determined by standard pharmaceutical procedures in cell cultures and/or
experimental organisms such as animals, e.g., for determining the LD50 (the dose
lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that
exhibit large therapeutic indices are preferred. The data obtained from these cell
culture assays and animal studies can be used in formulating a range of dosage for use
in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route
of administration utilized.
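
The therapeutic index calculation can be illustrated with a short Python sketch.
The dose-response values are invented, and the log-linear interpolation used to
estimate the 50% points is a deliberate simplification assumed for this example;
standard practice would typically fit a dose-response model instead.

```python
import math


def dose_at_half_response(doses, fractions):
    """Interpolate, on a log10 dose scale, the dose at which the responding
    fraction crosses 0.5. `doses` must be ascending and positive;
    `fractions` must be non-decreasing."""
    points = list(zip(doses, fractions))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_dose = math.log10(d_lo) + t * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    raise ValueError("response does not cross 50% within the tested doses")


def therapeutic_index(ld50, ed50):
    """Ratio LD50/ED50; larger values indicate a wider safety margin."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Invented data: fraction of animals responding at each dose (arbitrary units).
ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
ld50 = dose_at_half_response([100, 1000, 10000], [0.1, 0.5, 0.9])
print(ed50, ld50, therapeutic_index(ld50, ed50))
```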

For any compound identified and used in the method of the invention, the 
therapeutically effective dose can be estimated initially from cell culture assays. Such 
information can be used to more accurately determine useful doses in organisms such 
as plants and animals, preferably mammals, and most preferably humans. Levels in 
plasma may be measured, for example, by HPLC or other means appropriate for 
detection of the particular compound. 

The exact formulation, route of administration and dosage can be chosen by 
the individual physician in view of the patient's condition (see, e.g., Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1).

It should be noted that the attending physician would know how and when to
terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or
other systemic malady. Conversely, the attending physician would also know to
adjust treatment to higher levels if the clinical response were not adequate (precluding
toxicity). The magnitude of an administered dose in the management of the disorder
of interest will vary with the severity of the condition to be treated and the route of
administration. The severity of the condition may, for example, be evaluated, in part,
by standard prognostic evaluation methods. Further, the dose, and perhaps the dose
frequency, will also vary according to the age, body weight, and response of the
individual patient. A program comparable to that discussed above also may be used
in veterinary or phyto medicine.

Depending on the specific infection target being treated and the method
selected, such agents may be formulated and administered systemically or locally, i.e.,
topically. Techniques for formulation and administration may be found in Alfonso
and Gennaro (1995). Suitable routes may include, for example, oral, rectal,
transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular,
subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or
intraperitoneal injections.

For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiological saline buffer. For transmucosal administration,
penetrants appropriate to the barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art.

Use of pharmaceutically acceptable carriers to formulate identified
anti-microbials of the present invention into dosages suitable for systemic administration is
within the scope of the invention. With proper choice of carrier and suitable
manufacturing practice, the compositions of the present invention, in particular those
formulated as solutions, may be administered parenterally, such as by intravenous
injection. Appropriate compounds can be formulated readily using pharmaceutically
acceptable carriers well known in the art into dosages suitable for oral administration.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by
a patient to be treated.

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents
may be encapsulated into liposomes, then administered as described above.
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present
in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small
organic molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to 
achieve the intended purpose. Determination of the effective amounts is well within
the capability of those skilled in the art. 

In addition to the active ingredients, these pharmaceutical compositions may
contain suitable pharmaceutically acceptable carriers comprising excipients and
auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. The preparations formulated for oral
administration may be in the form of tablets, dragees, capsules, or solutions, including
those formulated for delayed release or only to be released when the pharmaceutical
reaches the small or large intestine.

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating,
entrapping or lyophilizing processes. 

Pharmaceutical formulations for parenteral administration include aqueous 
solutions of the active anti-microbial compounds in water-soluble form. 
Alternatively, suspensions of the active compounds may be prepared as appropriate 
oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils 
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, 
or liposomes. Aqueous injection suspensions may contain substances which increase 
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents 
which increase the solubility of the compounds to allow for the preparation of highly 
concentrated solutions. 

Pharmaceutical preparations for oral use can be obtained by combining the 
active compounds with solid excipient, optionally grinding a resulting mixture, and 
processing the mixture of granules, after adding suitable auxiliaries, if desired, to 
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as 
sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such 
as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum 
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, 
agar, or alginic acid or a salt thereof such as sodium alginate. 
Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium
dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification
or to characterize different combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit 
capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a 
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active 
ingredients in admixture with filler such as lactose, binders such as starches, and/or
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft 
capsules, the active compounds may be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. 

The above methodologies may be employed either actively or prophylactically
against an infection of interest.

Computer-related Aspects and Embodiments 

In addition to the provision of compounds as chemical entities, nucleotide
sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably
at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences
can also be provided in a variety of additional media to facilitate various uses.

Thus, as used in this section, "provided" refers to an article of manufacture,
rather than an actual nucleic acid molecule, which contains a nucleotide sequence of
the present invention; a nucleotide sequence of an exemplary bacteriophage or a
sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide
sequence at least 95%, more preferably at least 99% and most preferably at least
99.9% identical to such a bacteriophage or bacterial sequence, for example, to a
polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage
77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S.
aureus host). Such an article provides a large portion of the particular bacteriophage
genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame
(ORF)) in a form which allows a skilled artisan to examine and/or analyze the
sequence using means not directly applicable to examining the actual genome or gene
or subset thereof as it exists in nature or in purified form as a chemical entity.

In one application of this aspect, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium that can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such
as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used
to create an article of manufacture which includes one or more computer readable
media having recorded thereon a nucleotide sequence or sequences of the present
invention. Likewise, it will be clear to those of skill how additional computer
readable media that may be developed also can be used to create analogous
manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention.

A variety of data storage structures are available to a skilled artisan for
creating a computer readable medium having recorded thereon a nucleotide sequence
of the present invention. The choice of the data storage structure will generally be
based on the means chosen to access the stored information. In addition, a variety of
data processor programs and formats can be used to store the nucleotide sequence
information of the present invention on computer readable medium. The sequence
information can, for example, be presented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as
DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data processor structuring formats (e.g., text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of the present invention.
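
As one non-limiting sketch of recording sequence information as an ASCII file,
the following Python fragment writes a sequence to a FASTA-style text file and
reads it back. The FASTA layout, the function names, the record name and the
placeholder sequence are all assumptions made for illustration; the
specification does not prescribe any particular file format.

```python
import os
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record one nucleotide sequence as a plain ASCII, FASTA-style text file."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read back a single-record FASTA-style file as (name, sequence)."""
    with open(path, "r", encoding="ascii") as handle:
        lines = [line.strip() for line in handle if line.strip()]
    return lines[0][1:], "".join(lines[1:])


# Hypothetical record name and placeholder sequence (not a real phage sequence).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_sequence.fasta")
write_fasta(demo_path, "demo_record", "ATGACCGGTTAA" * 10)
demo_name, demo_sequence = read_fasta(demo_path)
```

The same sequence string could equally be stored as a field in a database
table; the text file is used here only because it needs no additional software.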

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form a nucleotide sequence of an unsequenced
bacteriophage, such as an exemplary bacteriophage listed in Table 1 or of a sequence
encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at
least 95%, more preferably at least 99% and most preferably at least 99.9% identical
to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of
bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage
96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1
(Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the
present invention enables the skilled artisan to routinely access the provided sequence
information for a wide variety of purposes.

Those skilled in the art understand that a variety of different search or
analysis software packages implement sequence search and analysis
algorithms, e.g., the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms. For
example, such search algorithms can be implemented on a Sybase system and used to
identify open reading frames (ORFs) within the bacteriophage genome which contain
homology to ORFs or proteins from other viruses, e.g., other bacteriophage, and other
organisms, e.g., the host bacterium. Among the ORFs discussed herein are protein
encoding fragments of the bacteriophage genomes which encode bacteria-inhibiting
proteins or fragments.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described. Such systems are
designed to identify, among other things, useful fragments of the bacteriophage
genomes.

As used herein, "a computer-based system" refers to the hardware, software,
and data storage media used to analyze the nucleotide sequence information of the
present invention. The minimum hardware of the computer-based systems of the
present invention comprises a central processing unit (CPU), input device, output
device, and data storage medium or media. A skilled artisan will readily recognize
that any of the currently available general purpose computer-based systems are suitable
for use in the present invention, as well as a variety of different specialized or
dedicated computer-based systems.

As stated above, the computer-based systems of the present invention
comprise data storage media having stored therein a nucleotide sequence of the
present invention and the necessary hardware and software for supporting and
implementing a search and/or analysis program.

As used herein, "data storage media" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide sequence 
information of the present invention. 

As used herein, "search program" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of the present genomic
sequences which match a particular target sequence or target motif. A variety of
known algorithms are disclosed publicly and a variety of commercially available
software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan
can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches and/or sequence analyses can be
adapted for use in the present computer-based systems.

As used herein in connection with sequence searches and analyses, a "target
sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the less likely a target sequence will be present as a random occurrence in
the database. Also, the target sequence length is preferably selected to include
sequence corresponding to a biologically relevant portion of an encoded product, for
example a region which is expected to be conserved across a range of source
organisms. Preferably the sequence length of a target polypeptide sequence is from
5-100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably
10-80 or 10-100 amino acids. Preferably the sequence length of a target
polynucleotide sequence is from 15-300 nucleotide residues, more preferably from
21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may
be of shorter length. Likewise, it may be desirable to search and/or analyze longer
sequences.
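
The relationship between target length and chance occurrence noted above can be
made concrete with a rough estimate. The Python sketch below assumes, purely
for illustration, that residues are independent and equally likely, which real
genomes do not strictly satisfy; the function name and the 40,000-residue
database size are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one specific target sequence,
    assuming independent, equally likely residues (a simplifying assumption)."""
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length


# Compare a six-nucleotide target with longer targets in a hypothetical
# 40,000-residue nucleotide database.
for length in (6, 15, 30):
    print(length, expected_random_hits(length, 40_000))
```

Under these assumptions a six-nucleotide target is expected to occur several
times by chance, whereas a 15- or 30-nucleotide target is not, consistent with
the preferred ranges stated above. Passing alphabet_size=20 gives the
corresponding estimate for amino acid sequences.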

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are 
chosen based on a three-dimensional configuration which is formed upon the folding 
of the target motif. There are a variety of target motifs known in the art. Protein 
target motifs include, but are not limited to, enzymatic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to promoter 
sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

A variety of structural formats for the input and output devices can be used to
input and output the information in the computer-based systems of the present
invention. A preferred format for an output device ranks fragments of the
bacteriophage or bacterial sequences possessing varying degrees of homology to the 



WO 00/32825 



PCT/IB99/02040 



target sequence or target motif. Such presentation provides a skilled artisan with a 
ranking of sequences which contain various amounts of the target sequence or target 
motif and identifies the degree of homology contained in the identified fragment. 
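
A ranked output of the kind described above can be sketched as follows. This
Python fragment is a deliberately simple stand-in, scoring ungapped windows by
percent identity; it is not BLAST and does not model homology in any
statistical sense. The function names and both sequences are hypothetical
placeholders.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length strings agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def rank_fragments(stored, target, top=3):
    """Score every ungapped window of the stored sequence against the target
    and return (identity, start, fragment) tuples, best match first."""
    k = len(target)
    scored = [
        (percent_identity(stored[i:i + k], target), i, stored[i:i + k])
        for i in range(len(stored) - k + 1)
    ]
    # Highest identity first; ties are broken by position in the stored sequence.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


# Placeholder sequences, not taken from any phage genome.
hits = rank_fragments("TTGACAATGCCGTATGACGTTAA", "ATGACG")
for identity, start, fragment in hits:
    print(f"{identity:6.1f}%  start={start}  {fragment}")
```

Each output row reports the degree of match alongside the fragment and its
position, which is the information the preferred output format is said to
present.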

A variety of comparing methods and/or devices and/or formats can be used to
compare a target sequence or target motif with the sequence stored in data storage
media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available
homology search programs can be used as the search program for the computer-based
systems of the present invention. Of course, suitable proprietary systems that may be
known to those of skill, or later developed, also may be employed in this regard.

Figure 6 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102 includes a
processor 106 connected to a bus 104. Also connected to the bus 104 are a main
memory 108 (preferably implemented as random access memory, RAM) and a variety
of secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent, for
example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A
removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic
tape, etc.) containing control logic and/or data recorded therein may be inserted into
the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable
storage medium 116, once it is inserted into the removable medium storage
device 114.

A nucleotide sequence of the present invention may be stored in a well-known
manner in the main memory 108, any of the secondary storage devices 110, and/or a
removable storage medium 116. During execution, software for accessing and
processing the sequence (such as search tools, comparing tools, etc.) resides in main
memory 108, in accordance with the requirements and operating parameters of the
operating system, the hardware system and the software program or programs.

The data storage medium in which the sequence is embodied and the central
processor need not be part of a single stand-alone computer, but may be separated so
long as data transfer can occur. For example, the processor or processors being
utilized for a search or analysis can be part of one general purpose computer, and the
data storage medium can be part of a second general purpose computer connected to a
network, or the data storage medium can be part of a network server. As another
example, the data storage medium can be part of a computer system or network
accessible over telephone lines or other remote connection method.

EXAMPLES

Example 1. Growth of Staph A bacteriophage 77 and purification of genomic 
DNA. 

The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was
used as a host to propagate its respective phage 77 (ATCC #27699-B1). Two rounds
of plaque purification of phage 77 were performed on soft agar essentially as
described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at
37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco
Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin (w/v)) and
10 µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence
of 400 µg/ml CaCl2. After incubation of 15 min at room temperature (RT), 2 ml of
melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the
mixture and poured onto the surface of 100 mm nutrient agar plates (0.3% Bacto beef
extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar (w/v)). After overnight
incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer
by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and
used for a second infection as described above. After overnight incubation at 30°C, a
single plaque was isolated and used as a stock.

The propagation procedure for bacteriophage 77 was modified from the agar
layer method of Swanstrom and Adams (1951). Briefly, the PS 77 strain was grown to
stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted
twenty-fold in NB and incubated at 37°C until the OD540 = 0.2. The suspension (15 x 10^7
bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of
100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After incubation
for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were added to the
mixture and poured onto the surface of 150 mm nutrient agar plates and incubated 16
hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to each plate
and the soft agar layer was collected by scraping off with a clean microscope slide
followed by shaking of the agar suspension for 5 min to break up the agar. The
mixture was then centrifuged for 10 min at 4,000 rpm (2,830xg) in a JA-10 rotor
(Beckman) and the supernatant fluid (lysate) was collected and subjected to a
treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500xg) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM
MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was
extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55
rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000xg) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000xg) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).
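
The bacteria-to-phage ratio used in the propagation step of this Example can be
checked with simple arithmetic, sketched here in Python. The two counts are
those stated above; the variable names, and the use of the conventional term
"multiplicity of infection" for the reciprocal ratio, are additions made only
for illustration.

```python
bacteria = 15 * 10**7   # bacteria in the suspension, as stated in Example 1
phage_pfu = 15 * 10**5  # plaque forming units added, as stated in Example 1

# Ratio quoted in the text: bacteria per phage particle.
bacteria_per_phage = bacteria / phage_pfu

# Reciprocal ratio (phage per bacterium), conventionally called the
# multiplicity of infection; this term is not used in the Example itself.
multiplicity_of_infection = phage_pfu / bacteria

print(bacteria_per_phage, multiplicity_of_infection)
```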

Example 2. DNA sequencing of Bacteriophage 77 genome 

Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris,
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an
amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
(pH 8.5).

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5
units of Klenow large fragment (New England Biolabs) for 15 min at room
temperature. The reaction was stopped by two phenol/chloroform extractions and the
DNA was precipitated with ethanol and the final DNA pellet was resuspended in 20
µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (Stratagene), which had been dephosphorylated by treatment
with calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA, 2 to 5 µl of
repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800
units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was
performed in E. coli DH10β according to standard procedures (Sambrook et al.,
1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 ng/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKSII+ vector. PCR amplification of foreign
insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and
0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows:
initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing
ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data
and the genome, all regions of phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit.

Example 3. Bioinformatic management of primary nucleotide sequence from
Phage 77.

Phage 77 sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BIG DYE™ terminator cycle sequencing ready reaction kit. The complete
sequence of bacteriophage 77 is shown in Table 2.

A software program was developed and used on the assembled sequence of
bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF
identification software can also be utilized, preferably programs which allow
alternative start codons. The software scans the primary nucleotide sequence starting
at nucleotide #1 for an appropriate start codon. Three possible selections can be made
for defining the nature of the start codon: I) selection of ATG, II) selection of ATG or
GTG, and III) selection of either ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This
latter initiation codon set corresponds to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code.

When an appropriate start codon is encountered, a counting mechanism is
employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached, or exceeded, then the sequence encompassed by these two codons (start and
stop codons) is defined as an ORF. This procedure is repeated, each time starting at
the next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
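
The scanning procedure described above can be sketched in Python as follows.
This is an illustrative reconstruction, not the software actually used: the
function and variable names are hypothetical, the scan is applied
independently to each reading frame, the stop codons TAA, TAG and TGA are
assumed, and the codon count is taken to run from the start codon through the
last codon before the stop. The input is assumed to contain only A, C, G and T.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(sequence, start_set="III", min_codons=33):
    """Return (strand, frame, start, end, codons) for each putative ORF.

    start and end are 0-based offsets on the scanned strand; end is exclusive
    and includes the stop codon.
    """
    starts = START_SETS[start_set]
    seq = sequence.upper()
    orfs = []
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            pos = frame
            while pos + 3 <= len(dna):
                if dna[pos:pos + 3] not in starts:
                    pos += 3
                    continue
                # Walk downstream to the next in-frame stop codon.
                end = pos + 3
                while end + 3 <= len(dna) and dna[end:end + 3] not in STOP_CODONS:
                    end += 3
                if end + 3 > len(dna):
                    break  # no stop codon before the end of the sequence
                codons = (end - pos) // 3
                if codons >= min_codons:
                    orfs.append((strand, frame, pos, end + 3, codons))
                pos = end + 3  # resume just after the stop codon found
    return orfs


# Synthetic placeholder sequence (not phage 77): one 41-codon ORF in frame 2.
demo = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
print(find_orfs(demo, start_set="I"))
```

Selecting start_set "II" or "III" widens the set of accepted initiation
codons in the manner of selections II and III above.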

Sequence homology (BLAST) searches for each ORF are then carried out
using an implementation of BLAST programs, although any of a variety of different
sequence comparison and matching programs can be utilized as known to those
skilled in the art. Downloaded public databases used for sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep-lk.fa);

vii) Streptococcus pneumoniae
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

viii) Mycobacterium tuberculosis CSU#9
(ftp://ftp.tigr.org/pub/data/m_tuberculosis/TB_091097.Z); and

ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).
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The results of the homology searches performed on the ORFs is shown in 
TableS. 

Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible
expression system.

The shuttle vector pT0021, in which the firefly luciferase (lucFF) expression
is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), was
modified in the following fashion. Two oligonucleotides corresponding to a short
antigenic peptide derived from the hemagglutinin protein of influenza virus (HA
epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence
(with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into pT0021 vector which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A.
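As an illustrative check of the two oligonucleotides given above (a sketch only; the helper names are hypothetical and the standard codon table is assumed), the following confirms that the strands are complementary apart from their 4-nucleotide 5' overhangs, and translates the upper-case tag region.

```python
from itertools import product

SENSE = "gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA"
ANTISENSE = "agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Standard genetic code, codons enumerated in T, C, A, G order.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), _AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of a DNA string (case-insensitive input)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def duplex_overhangs(sense, antisense, overhang=4):
    """Return the two 5' overhangs if the remaining regions are fully paired."""
    if reverse_complement(sense[overhang:]) != antisense[overhang:].upper():
        raise ValueError("strands are not complementary outside the overhangs")
    return sense[:overhang].upper(), antisense[:overhang].upper()


# Upper-case letters in the sense strand mark the HA tag coding region.
tag_dna = "".join(base for base in SENSE if base.isupper())
tag_peptide = translate(tag_dna)
overhangs = duplex_overhangs(SENSE, ANTISENSE)
```

The resulting overhangs (GATC on the sense strand, AGCT on the antisense strand) are the ones expected to pair with BamHI- and HindIII-digested vector ends, respectively.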

Each ORF, encoded by Bacteriophage 77, larger than 33 amino acids and
having a Shine-Dalgarno sequence upstream of the initiation codon was selected for
functional analysis for bacterial inhibition. In total, 98 ORFs were selected and
screened as detailed below. A list of these is presented in Table 3. Each individual
ORF, from initiation codon to last codon (excluding the stop codon), was amplified
from phage genomic DNA using the polymerase chain reaction (PCR). For PCR
amplification of ORFs, each sense strand primer targets the initiation codon and is
preceded by a BamHI restriction site (5'cgggatcc3') and each antisense oligonucleotide
targets the penultimate codon (the one before the stop codon) of the ORF and is
preceded by a SalI restriction site (5'gcgtcgaccg3'). The PCR product of each ORF was
gel purified and digested with BamHI and SalI. The digested PCR product was then
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis
using primers flanking the cloning site. The names and sequences of the primers that
were used for the PCR amplification were: HAF:
5'TATTATCCAAAACTTGAACA3'; HAR: 5'CGGTGGTATATCCAGTGATT3'. The
sequence integrity of cloned ORFs was verified directly by DNA sequencing using
primers HAF and HAR. In cases where verification of ORF sequence could not be
achieved by one pass with the sequencing primers, additional internal primers were
selected and used for sequencing.
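The primer layout described above can be illustrated with a short sketch. The restriction-site tails are those given in the text; the function name and the length of the ORF-specific annealing region (18 nucleotides by default) are assumptions, since the text does not state how long the gene-specific portion of each primer was.

```python
BAMHI_TAIL = "cgggatcc"    # 5' tail of each sense primer (from the text)
SALI_TAIL = "gcgtcgaccg"   # 5' tail of each antisense primer (from the text)
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_orf_primers(orf_with_stop, anneal_len=18):
    """Return (sense, antisense) primers for an ORF given 5'->3' with its stop codon.

    anneal_len is a hypothetical parameter: the number of ORF nucleotides
    each primer anneals to.
    """
    orf = orf_with_stop.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame ORF ending with a stop codon")
    coding = orf[:-3]  # the stop codon is excluded so the HA tag stays in frame
    if len(coding) < anneal_len:
        raise ValueError("ORF shorter than the requested annealing length")
    sense = BAMHI_TAIL + coding[:anneal_len]
    antisense = SALI_TAIL + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

For example, design_orf_primers("ATGGCTAAAGGTGAACTGCGTCATTAA", anneal_len=9) gives a sense primer that begins at the ATG and an antisense primer that ends on the complement of the last sense codon, so that the amplified product carries no stop codon.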

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a
recipient for the expression of recombinant plasmids. Electroporation was performed
essentially as previously described (Schenk and Laddaga, 1992). Selection of
recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing
30 µg/ml of kanamycin.

For each ORF introduced in the pTHA plasmid, 3 independent transformants
were isolated and used to individually inoculate cultures in 5 ml of TSB containing
30 µg/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of
this stationary phase culture was used to generate a frozen glycerol stock of the
transformant (stored at -80°C). The remaining culture was used for plasmid DNA
extraction. Bacterial cells were harvested by centrifugation at 3000 x g at 22°C for 5
min. The pellet was resuspended in 200 µl of 25% sucrose containing 25 U/ml of
lysostaphin and incubated for 15 min at 37°C. Then, 400 µl of alkaline SDS solution
(3% SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room
temperature. After the alkaline SDS treatment, 300 µl of ice-cold 3 M sodium acetate
pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room
temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube
and 650 µl of isopropanol (stored at room temperature) were added. The mix was then
centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, the pellet
washed with 70% ethanol, and resuspended in 320 µl sterile distilled water.

The presence of individual phage 77 ORF DNA inserts in the plasmid was
verified by PCR amplification using 1.5 µl transformant miniprep DNA in a PCR
with primers flanking the cloning site of ORF in pTHA vector (HAF and HAR). The
composition of the PCR reaction and the cycling parameters are identical to those
employed for library screening described above.

Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77
ORFs.

The anti-microbial activity of individual phage 77 ORFs was monitored by
two growth inhibitory assays, one on solid agar medium, the other in liquid medium.
In general, Staphylococcus bacteria transformed with expression plasmids containing
individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At
pre-determined times, arsenite was added to the culture to induce transcription of the
phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter
in the pTHA expression plasmid.

The effect of ORF induction on bacterial growth characteristics was then
monitored and quantitated. The growth inhibition assay on solid medium was
performed by streaking pTHA/ORF containing S. aureus transformants onto LB-Kn
and TSA-Kn plates containing increasing concentrations of sodium arsenite (0; 2.5; 5;
and 7.5 µM). Arsenite is used to induce the expression of cloned DNA in pTHA
vector. In parallel, 3 µl of 1/10 and 1/100 dilutions of the frozen cultures of the
pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn
plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates were then incubated 16 hrs at 37°C, and the effect of arsenite-induced ORF
expression on bacterial growth was monitored and quantitated by comparing the
extent of growth to that seen on control plates. As positive controls for growth
inhibition, the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner
et al., 1998) were subcloned into the pTHA ars inducible vector and used.

For the growth inhibition assay in liquid medium, stationary phase cultures
were prepared by inoculating 2.5 ml TSB-Kn with frozen S. aureus RN4220
transformants containing phage 77 ORFs cloned in pTHA vector followed by
incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same
medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log
phase. 150 µl of such culture were then mixed with 2.35 ml TSB-Kn medium with or
without arsenite (the final concentration of arsenite in the medium was 0 or 5 µM).
After 3.5 hrs incubation at 37°C with shaking at 250 rpm, 100 µl of
bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold
dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted
onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of
surviving colonies counted the following day. The growth inhibitory property of
individual ORFs was then quantitated by comparing CFU numbers under normal or
arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in
Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed
out herein). Inhibition results are shown in Figures 4A-C.
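The CFU comparison described above amounts to simple arithmetic, sketched below. The function names are hypothetical, and the spotted volume is an assumed parameter because the text does not state the volume of each spot; the numbers in the usage note are invented for illustration and are not results from this work.

```python
def cfu_per_ml(colonies, tenfold_dilutions, spot_volume_ul):
    """Back-calculate CFU/ml of the undiluted culture from one counted spot.

    colonies          -- colonies counted in the spot
    tenfold_dilutions -- number of serial ten-fold dilution steps applied
    spot_volume_ul    -- volume spotted, in microliters (assumed parameter)
    """
    if spot_volume_ul <= 0:
        raise ValueError("spot volume must be positive")
    return colonies * 10 ** tenfold_dilutions * 1000 / spot_volume_ul


def surviving_fraction(cfu_induced, cfu_uninduced):
    """Ratio of CFU/ml with arsenite to CFU/ml without arsenite."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced culture must contain viable cells")
    return cfu_induced / cfu_uninduced
```

With illustrative counts, an uninduced culture giving 25 colonies from a 10 µl spot of the fourth ten-fold dilution corresponds to 2.5 x 10^7 CFU/ml; if the induced culture gives 25 colonies from the second dilution, the surviving fraction is about 0.01.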


Example 6: Identification of Cecropin Signature Motif in Staphylococcus aureus
Bacteriophage 3A ORF

The genome sequence of S. aureus bacteriophage 3A was determined and the sequence
was analyzed essentially as described for bacteriophage 77 in the examples above.
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of
an amino acid sequence corresponding to a cecropin signature motif was observed.
This motif (WDGHKTLEK) is located at amino acid positions 481-489. Cecropins were
originally identified in proteins from the cecropia moth and are recognized as potent
antibacterial proteins that constitute an important part of the cell-free immunity of
insects. Cecropins are small proteins (31-39 amino acid residues) that are active
against both Gram-positive and Gram-negative bacteria by disrupting the bacterial
membranes. Although the mechanisms by which the cecropins cause cell death are
not fully understood, they are generally thought to involve channel formation and
membrane destabilization.
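The position convention used above (1-based, inclusive amino acid coordinates) can be illustrated with a short sketch that locates a motif within a protein sequence. The function name is hypothetical, and the sequence in the usage note is a placeholder built only to show the coordinate arithmetic; it is not the sequence of the phage 3A ORF.

```python
def locate_motif(protein, motif):
    """Return all occurrences of motif as 1-based inclusive (start, end) pairs."""
    protein = protein.upper()
    motif = motif.upper()
    hits = []
    pos = protein.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = protein.find(motif, pos + 1)  # allow overlapping matches
    return hits
```

For a placeholder protein of 480 alanines followed by WDGHKTLEK, locate_motif reports the motif at positions 481-489, a span of nine residues.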

The identification of a motif corresponding to a known inhibitor suggests that
the product of ORF002 is also an inhibitory compound. Such inhibitory activity can
be confirmed as described herein or by other methods known in the art. Confirmation
of the inhibitory activity would indicate that the ORF product could serve as the basis
for construction of mimetic compounds and other inhibitors directed to the target of
the ORF002 product.

Boman & Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126.

Boman, 1991, Cell 65:205-207.

Boman et al., 1991, Eur. J. Biochem. 201:23-31.

Wang et al., J. Biol. Chem. 273:27438-27448.

Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD:

Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference
Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD
(Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of
phage 44AHJD were performed on soft agar essentially as described in Sambrook et
al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C
in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco
Laboratories #0003-17-8), supplemented with 0.5% NaCl]. The culture was then
diluted 20 fold in NB and incubated at 37°C to an OD540 of 0.2. In order to obtain
single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin) and 10 µl
were used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml of
CaCl2. After incubation for 15 min at room temperature, 2 ml of melted soft agar (NB
supplemented with 0.6% of agar) were added to the mixture and poured onto the
surface of 100 mm nutrient agar plates (3 g Bacto Beef extract, 5 g Bactopeptone,
0.5% NaCl and 15 g of Bacto agar per liter (Difco Laboratories #0001-17-0)). After
overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of
phage buffer by end over end rotation for 2 h at room temperature, and the phage
suspension was diluted and used for a second infection as described above. After
overnight incubation at 37°C, a single plaque was isolated and used as a stock.

Large scale purification of bacteriophage and preparation of phage DNA was
as follows.

The propagation method was carried out by using the agar layer method
described by Swanstrom and Adams (1951). Briefly, the PS 44A strain was grown to
stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted 20x
in NB and incubated at 37°C until the A540 = 0.2. The suspension (15 x 10^7 bacteria)
was then mixed with 15 x 10^5 phage particles to give a ratio of 100 bacteria/phage
particle in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at room
temperature, 7.5 ml of melted soft agar were added to the mixture and poured onto the
surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the
lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by
scraping off with a clean microscope slide and shaken vigorously for 5 min to break
up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g)
using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and subjected
to a treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To
precipitate the phage particles, 10% (w/v) of PEG 8000 and 0.5 M of NaCl were
added to the lysate and the mixture was incubated on ice for 16 h. The phage was
recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R
table top centrifuge (Beckman).

The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM
MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was extracted with 1
volume of chloroform and further purified by centrifugation on a preformed cesium
chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor
and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).


Example 8. DNA sequencing of the Bacteriophage 44AHJD genome.

Four µg of phage DNA was diluted in 200 µl of TE pH 8.0 in a 1.5 ml
eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher
Scientific). Samples were sonicated under an amplitude of 3 µm with bursts of 5 s
spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then
size fractionated by electrophoresis on 1% agarose gels.
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified
using a commercial DNA extraction system according to the instructions of the
manufacturer (Qiagen) and eluted in 50 µl of 1 mM Tris-HCl [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as
follows. Reactions were performed in a final volume of 100 µl containing DNA, 10
mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 µg BSA, 100 µM
of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min
at 12°C followed by addition of 12.5 units of Klenow fragment (New England
Biolabs) for 15 min at room temperature. The reaction was stopped by two
phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended
in 20 µl of H2O.

Cloning of the sonicated phage DNA into pKSII vector and transformation:

Blunt-ended DNA fragments were cloned by ligation directly into the HincII
site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline
phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector, 2
to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl
containing 800 units of T4 DNA ligase (New England Biolabs) and was incubated
overnight at 16°C. Transformation and selection of positive clones was performed in
the host strain DH10β of E. coli using ampicillin as a selective antibiotic as described
in Sambrook et al. (1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the HincII cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 mM primer, 187.5 µM each
dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed
by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp
were selected and plasmid DNA was prepared from the selected clones using the
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was determined
using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism
BigDye™ primer cycle sequencing (21M13 primer: #403055) (M13REV primer:
#403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction kit
(Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the
genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.
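A coverage check of the kind implied by this criterion might look like the sketch below. It is an illustration under stated assumptions: the criterion is interpreted as requiring, at every genome position, at least one read in each direction and reads from at least two distinct clones; the function name and the read representation (clone identifier, start, end, strand, with 0-based half-open coordinates) are hypothetical.

```python
def regions_needing_primers(genome_len, reads):
    """Return (start, end) intervals, 0-based half-open, that fail the criterion.

    reads -- iterable of (clone_id, start, end, strand) tuples, strand "+" or "-"
    """
    fwd = [False] * genome_len
    rev = [False] * genome_len
    clones = [set() for _ in range(genome_len)]
    for clone_id, start, end, strand in reads:
        for pos in range(max(start, 0), min(end, genome_len)):
            if strand == "+":
                fwd[pos] = True
            else:
                rev[pos] = True
            clones[pos].add(clone_id)

    regions = []
    run_start = None
    for pos in range(genome_len):
        ok = fwd[pos] and rev[pos] and len(clones[pos]) >= 2
        if not ok and run_start is None:
            run_start = pos          # a failing stretch begins here
        elif ok and run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    if run_start is not None:
        regions.append((run_start, genome_len))
    return regions
```

Each interval returned marks a stretch for which a sequencing primer would be selected and phage DNA used directly as template.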

Example 9. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD
is shown in Table 16.

A software program was used on the assembled sequence of bacteriophage
44AHJD to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 44AHJD are listed in Tables 17 & 18.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for sequence
analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Staphylococcuspyogenes(ftp://RpAigr.org/pu^ 1121 
30 97.Z); 

vii) PRODOM(ftp://ftp.toulouse.inra.fr/pub/prod 
astgz); 

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage 
44AHJD are shown in Tables 19 & 20. 

Example 10. Sub-Cloning of Bacteriophage 44AHJD ORFs.

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 44AHJD ORF sequence is
inducible. For example, the shuttle vector pT0021, in which the firefly luciferase
(lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et
al., 1997), can be modified in the following fashion. Two oligonucleotides
corresponding to a short antigenic peptide derived from the hemagglutinin protein of
influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense
strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is:
5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into pT0021 vector which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another
useful vector construct is shown in Fig. 1B).

Each ORF, encoded by Bacteriophage 44AHJD, larger than 33 amino acids
and having a Shine-Dalgarno sequence upstream of the initiation codon can be
selected for functional analysis for bacterial inhibition. Each individual ORF, from
initiation codon to last codon (excluding the stop codon), can be amplified from phage
genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of
ORFs, each sense strand primer targets the initiation codon and is preceded by a
BamHI restriction site (5'cgggatcc3') and each antisense oligonucleotide targets the
penultimate codon (the one before the stop codon) of the ORF and is preceded by a
SalI restriction site (5'gcgtcgaccg3'). The PCR product of each ORF can be gel
purified and digested with BamHI and SalI. The digested PCR product can then be
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones can be picked and their insert sizes confirmed by PCR
analysis using primers flanking the cloning site. The following primers can be used
for PCR amplification: HAF: 5'TATTATCCAAAACTTGAACA3'; HAR:
5'CGGTGGTATATCCAGTGATT3'. The sequence integrity of cloned ORFs can be
verified directly by DNA sequencing using primers HAF and HAR. In cases where
verification of ORF sequence cannot be achieved by one pass with the sequencing
primers, additional internal primers can be selected and used for sequencing.

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as
a recipient for the expression of recombinant plasmids. Electroporation will be
performed essentially as previously described (Schenk and Laddaga, 1992). Selection
of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates
containing 30 µg/ml of kanamycin.

Alternatively, a constitutive promoter can be used to drive expression of the
introduced ORF, and cell growth compared to that of control bacterial cells containing
the parental vector lacking any introduced phage ORF. Recombinant plasmids will be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).

Cloning of ORFs with a Shine-Dalgarno sequence

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), can be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site and each antisense strand primer starts at the last codon (excluding the
stop codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes with sites contained on
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as restriction digestion.
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the
same primers as used for PCR. In cases where verification of the ORFs cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers can be selected and used for sequencing. Recombinant plasmids can be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).

Induction of gene expression from the ars promoter.

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of S. aureus transformed cells containing phage 44AHJD ORFs onto agar plates
containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The
plates are incubated overnight at 37°C, after which growth of the ORF
transformants on plates that contain arsenite is compared to growth on plates without
arsenite.

2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to the mid log phase (OD540 of 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 44AHJD ORFs
on bacterial cell growth is then monitored by measuring the OD540 and comparing the
rate of growth to the culture not containing inducer. As positive controls for growth
inhibition, the kil gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
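The decision rule stated above can be written out as a small sketch. The function name and the returned labels are hypothetical, and treating any strictly lower colony count as inhibition is a simplifying assumption; in practice replicate platings and a significance threshold would be needed before drawing a conclusion.

```python
def classify_orf(colonies_induced, colonies_uninduced):
    """Classify an ORF from colony counts after plating on inducer-free agar.

    colonies_induced   -- colonies from the culture grown with inducer
    colonies_uninduced -- colonies from the culture grown without inducer
    """
    if colonies_uninduced <= 0:
        raise ValueError("the uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"       # no colonies recovered after induction
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"     # fewer, but detectable, colonies
    return "no inhibition detected"
```

For instance, counts of 0 versus 200 colonies would be labeled bactericidal, and 40 versus 200 bacteriostatic; these numbers are illustrative only.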

Example 11. Growth of Enterococcus bacteriophage 182 and purification of
genomic DNA.

The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix
d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective
phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque
purification of phage 182 were performed on soft agar essentially as described in
Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight
at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g
Bacto dextrose, 5 g Sodium chloride, and 2.5 g Dipotassium phosphate per liter
(Difco Laboratories #0370-17-3)]. The culture was then diluted 20 fold in TSB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 182 was subjected to 10-fold serial dilutions
using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin
(w/v)) and 10 µl of each dilution was used to infect 0.5 ml of the bacterial cell
suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB
supplemented with 0.6% agar) was added to the mixture and poured onto the surface
of 100 mm Tryptic Soy Agar plates [TSA: 15 g Tryptone peptone, 5 g Soytone
peptone, 5 g Sodium chloride and 15 g of Agar per liter (Difco Laboratories
#0369-17)]. After overnight incubation at 37°C, a single plaque was isolated, resuspended in
1 ml of phage buffer by end over end rotation for 2 hrs at room temperature, and the
phage suspension was diluted and used for a second infection as described above.
After overnight incubation at 37°C, a single plaque was isolated and used as a stock
for all subsequent manipulations.

The propagation procedure for bacteriophage 182 was modified from the agar
layer method of Swanstrom and Adams (1951). Briefly, the Enterococcus sp. PS
strain was grown to stationary phase overnight at 37°C in TSB. The culture was then
diluted 20 fold in TSB and incubated at 37°C until the A540 = 0.2. The suspension
(15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to
give a ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of
melted soft agar (TSB plus 0.6% agar) were added to the mixture, which was poured
onto the surface of 150 mm TSA plates and incubated for 16 hrs at 37°C. To collect
the plate lysate, 20 ml of TSB were added to each plate and the soft agar layer was
collected by scraping it off with a clean microscope slide, followed by vigorous
shaking of the agar suspension for 5 min to break up the agar. The mixture was then
centrifuged for 10 min at 4,000 rpm (2,830 x g) using a JA-10 rotor (Beckman) and
the supernatant fluid (lysate) was collected and treated with 10 µg/ml of DNase I
and RNase A for 30 min at 37°C. To precipitate the phage particles, the phage
suspension was adjusted to 10% (w/v) of PEG 8000 and 0.5 M of NaCl, followed by
incubation at 4°C for 16 hrs. The phage was recovered by centrifugation at 4,000 rpm
(3,500 x g) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman). The pellet
was resuspended in 2 ml of phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and
0.1% gelatin). The phage suspension was extracted with 1 volume of chloroform and
further purified by centrifugation on a cesium chloride step gradient as described
in Sambrook et al. (1989), using a TLS 55 rotor in an Optima TLX ultracentrifuge
(Beckman) for 2 hrs at 28,000 rpm (67,000 x g) at 4°C. Banded phage was collected
and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at
40,000 rpm (64,000 x g) for 24 hrs at 4°C using a TLV rotor (Beckman). The phages
were harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis
buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage
DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml
Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive
extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of
chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM
Tris-HCl [pH 8.0], 1 mM EDTA).
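
The infection ratio and the precipitation conditions given above reduce to simple
arithmetic. The following Python sketch is illustrative only: the function names,
the 100 ml lysate volume and the NaCl formula weight of 58.44 g/mol are assumptions
that are not stated in the procedure, and the volume change caused by dissolving
the solids is ignored.

```python
NACL_G_PER_MOL = 58.44  # formula weight of NaCl; assumed constant, not from the text


def pfu_for_ratio(bacteria: float, bacteria_per_pfu: float) -> float:
    """Return the number of pfu that gives the requested bacteria/pfu ratio."""
    if bacteria_per_pfu <= 0:
        raise ValueError("bacteria_per_pfu must be positive")
    return bacteria / bacteria_per_pfu


def precipitation_masses(lysate_ml: float,
                         peg_percent_w_v: float = 10.0,
                         nacl_molar: float = 0.5) -> tuple[float, float]:
    """Return (g PEG 8000, g NaCl) to add to a lysate of the given volume.

    10% (w/v) means 10 g per 100 ml; 0.5 M means 0.5 mol per liter.
    """
    peg_g = lysate_ml * peg_percent_w_v / 100.0
    nacl_g = nacl_molar * (lysate_ml / 1000.0) * NACL_G_PER_MOL
    return peg_g, nacl_g


if __name__ == "__main__":
    # 15 x 10^7 bacteria at 100 bacteria/pfu, as in the propagation step.
    print(pfu_for_ratio(15e7, 100))
    # Hypothetical 100 ml of cleared lysate.
    print(precipitation_masses(100.0))
```

With 15 x 10^7 bacteria and a ratio of 100 bacteria/pfu, the first function returns
15 x 10^5 pfu, which matches the phage input stated in the text.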

Example 12. DNA sequencing of the Bacteriophage 182 genome.

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec cooling in ice/water for
3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM
Tris [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100
ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England
Biolabs) and was incubated overnight at 16°C. Transformation and selection of
bacterial clones containing recombinant plasmids was performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).
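
The thermocycling program above can be written down as data, which makes the hold
times easy to check. The following Python sketch is illustrative only: the class
and function names are assumptions, and the computed total covers programmed hold
times only, not the ramp times of any particular thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the thermocycling parameters stated in the text.
INITIAL = Step("initial denaturation", 94, 120)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 58, 30),
    Step("extension", 72, 120),
)
N_CYCLES = 20
FINAL = Step("final extension", 72, 600)


def total_hold_seconds(initial: Step, cycle: tuple[Step, ...],
                       n_cycles: int, final: Step) -> int:
    """Sum the programmed hold times of the whole PCR program."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{seconds} s = {seconds / 60:.0f} min of programmed hold time")
```

Each cycle holds for 180 sec, so the 20 cycles plus the initial and final steps
amount to 4,320 sec (72 min) of hold time.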
The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism BigDye™ primer cycle sequencing (21M13 primer: #403055)(M13REV
primer: #403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially
met, a sequencing primer was selected and phage DNA was used directly as sequencing
template employing the ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.

Example 13. Bioinformatic management of primary nucleotide sequence. 

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing the ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in
Table 21.

A software program was used on the assembled sequence of bacteriophage 182
to identify all putative ORFs of at least 33 codons. The software scans the primary
nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three
possible selections can be made for defining the nature of the start codon: I) selection
of ATG, II) selection of ATG or GTG, and III) selection of any of ATG, GTG, TTG,
CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one
reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an appropriate start codon is encountered, a counting
mechanism is employed to count the number of codons (groups of three nucleotides)
between this start codon and the next stop codon downstream of it. If a threshold
value of 33 is reached, or exceeded, then the sequence encompassed by these two
codons is defined as an ORF. This procedure is repeated, each time starting at the
next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 182 are listed in Tables 22 & 23.
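
The scanning procedure described above can be sketched in Python as follows. This
is not the software that was actually used, which is not disclosed; it is a minimal
illustration under stated assumptions: the codon count includes the start codon and
excludes the stop codon, a start codon with no in-frame stop before the end of the
sequence is discarded, and all function and variable names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq: str, frame: int, starts: set[str], min_codons: int):
    """Return (start, end) pairs for one reading frame of one strand.

    Coordinates are 0-based, half-open, and include the stop codon.
    """
    orfs = []
    i, n = frame, len(seq)
    while i + 3 <= n:
        if seq[i:i + 3] not in starts:
            i += 3
            continue
        j = i
        while j + 3 <= n and seq[j:j + 3] not in STOPS:
            j += 3
        if j + 3 > n:          # assumption: no in-frame stop, discard
            break
        if (j - i) // 3 >= min_codons:
            orfs.append((i, j + 3))
        i = j + 3              # resume after the stop codon just found
    return orfs


def find_orfs(seq: str, start_set: str = "III", min_codons: int = 33):
    """Scan three frames on both strands; return (strand, frame, start, end).

    Minus-strand coordinates refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    starts = START_SETS[start_set]
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in scan_frame(s, frame, starts, min_codons):
                found.append((strand, frame, start, end))
    return found


if __name__ == "__main__":
    toy = "ATG" + "GCA" * 40 + "TAA"   # synthetic 41-codon ORF
    print(find_orfs(toy, start_set="I"))
```

Selecting start set III rather than I only widens the set of triplets accepted as
initiation codons; the counting and threshold logic are unchanged.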

Sequence homology searches for each ORF were carried out using an implementation
of BLAST programs. Downloaded public databases used for sequence analysis
include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);
iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);
v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);
vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);
vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);
viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);
ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage
182 are shown in Tables 24 & 26.

Example 14. Sub-Cloning of Bacteriophage 182 ORFs.
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 182 ORF sequence is inducible.
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid
can be modified by conventional techniques to insert the inducible arsenite promoter,
derived from the shuttle vector pT0021, in which the firefly luciferase (lucFF)
expression is controlled by the ars promoter/operator from a S. aureus plasmid
(Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997). Recombinant luminescent
bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol.
63:4456-4461). This modified shuttle vector will contain the ars promoter, arsR gene
and a cloning site for introduction of individual phage ORFs downstream from a
Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the
arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter
activity is dependent on the proteins NisR and NisK, which constitute a two-component
signal transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus
agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis.
Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained
in all of these species. A vector containing this promoter was published by
Eichenbaum Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR
(1998) Appl Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, which
allow replication and transcription in Enterococcus can also be utilized.

Alternatively, a constitutive promoter can be used (e.g., the β-lactamase
promoter is constitutive in E. faecalis - see ref. 1) to drive expression of the
introduced ORF, and cell growth is compared to that of control bacterial cells
containing the parental vector lacking any introduced phage ORF. Recombinant
plasmids are introduced into E. faecalis strain FA2-2 by electroporation, as
previously described (Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K.,
Hattori, H., Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40,
1157-1163).
Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding
the stop codon) and is preceded by a different restriction site. The PCR product of
each ORF will be gel purified and digested with the restriction enzymes whose sites
are contained on the PCR oligonucleotides. The digested PCR product is then gel
purified using the Qiagen kit, ligated into the modified shuttle vector, and used to
transform bacterial strain DH10β. Recombinant clones are then picked and their insert
sizes confirmed by PCR analysis using primers flanking the cloning site as well as by
restriction digestion. The sequence fidelity of cloned ORFs will be verified by DNA
sequencing using the same primers as used for PCR. In cases where verification of an
ORF cannot be achieved by one pass of sequencing using primers flanking the cloning
site, internal primers will be selected and used for sequencing. Recombinant plasmids
will be introduced into E. faecalis strain FA2-2 by electroporation, as previously
described (Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H.,
Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
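
The primer layout described above (restriction site followed by the first codons on
the sense primer, and a different restriction site followed by the reverse
complement of the last codons, excluding the stop codon, on the antisense primer)
can be sketched in Python. The sketch is illustrative only: the text names no
enzymes, so the BamHI (GGATCC) and HindIII (AAGCTT) recognition sites, the 20
nucleotide annealing length and all function names are assumptions, and practical
primers usually carry a few extra 5' nucleotides to aid digestion.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, site_5: str = "GGATCC", site_3: str = "AAGCTT",
                   anneal_len: int = 20) -> tuple[str, str]:
    """Return (sense, antisense) primers, both written 5' to 3'.

    `orf` runs from the initiation codon through the stop codon (if present).
    The two restriction sites are hypothetical example choices.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if site_5 == site_3:
        raise ValueError("the two restriction sites must differ")
    # Drop the stop codon so the antisense primer starts at the last codon.
    coding = orf[:-3] if orf[-3:] in STOPS else orf
    if anneal_len > len(coding):
        raise ValueError("annealing length exceeds the coding sequence")
    sense = site_5 + coding[:anneal_len]
    antisense = site_3 + reverse_complement(coding[-anneal_len:])
    return sense, antisense


if __name__ == "__main__":
    toy_orf = "ATGAAACCCGGGTTTAAACCCGGGTTTTAA"  # synthetic example, not a phage ORF
    print(design_primers(toy_orf, anneal_len=9))
```

Using two different sites, as the text specifies, fixes the orientation of the
insert relative to the promoter in the shuttle vector.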
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing
different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The plates are
incubated overnight at 37°C, after which the growth of the ORF transformants on
plates that contain arsenite is compared to that on plates without arsenite.
2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid log phase (OD540 = 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added
at different final concentrations (ranging from 2.5 to 10 µM) and the culture is
incubated for an additional 2 hrs at 37°C. The effect of expression of the phage 182
ORFs on bacterial cell growth is then monitored by measuring the OD540 and comparing
the rate of growth to that of the culture not containing inducer. As positive controls
for growth inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A.,
Lubitz, W. and Blasi, U. 1993 Virology #193: 1033-1036), and the holin/lysin genes of
the Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters #162:265-274) were
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
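
The two read-outs described above (relative growth by optical density and colony
counts after plating) can be expressed as a short Python sketch. It is illustrative
only: the function names and labels are assumptions, and no statistical threshold
is applied, although replicate plates would be needed in practice before calling a
difference meaningful.

```python
def relative_growth(od_start: float, od_induced: float, od_uninduced: float) -> float:
    """OD increase of the induced culture as a fraction of the uninduced one."""
    control_gain = od_uninduced - od_start
    if control_gain <= 0:
        raise ValueError("uninduced control did not grow")
    return (od_induced - od_start) / control_gain


def classify_orf(colonies_induced: int, colonies_uninduced: int) -> str:
    """Apply the colony-count criteria stated in the text."""
    if colonies_uninduced <= 0:
        raise ValueError("uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"
    return "no inhibition detected"


if __name__ == "__main__":
    # Hypothetical readings: both cultures start at OD540 = 0.2.
    print(relative_growth(0.2, 0.3, 0.6))
    print(classify_orf(40, 200))
```

A relative growth value near 1 indicates no effect of induction, while values near
0 indicate strong growth inhibition.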

Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of
genomic DNA.

The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz,
1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al.,
1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used.
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae
is also available from Felix d'Herelle Reference Center in Quebec, Canada as catalog
number HER 1054. Other S. pneumoniae strains are also available from ATCC.)
Two rounds of plaque purification of phage Dp-1 were performed on soft agar
essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS
strain was grown overnight at 37°C in K-CAT media [K-CAT: 10 g Bacto casitone, 5 g
Bacto tryptone, 1 g Yeast extract, 5 g Potassium chloride, 0.2% Glucose, 30 mM
Potassium phosphate buffer [pH 8] and 250,000 Units Catalase per liter (Boehringer
Mannheim #10683600)]. The culture was then diluted 20 fold in K-CAT and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions
using the phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM
MgCl2) and 10 µl of each dilution was used to infect 0.5 ml of the cell suspension.
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented
with 0.8% of agar) were added to the mixture and poured onto the surface of 100 mm
K-CAT agar plates [K-CAT supplemented with 1.2% of agar]. After solidification of
the soft agar layer, an additional 5 ml of melted soft agar was added to visualize
distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single
plaque was isolated, resuspended in 1 ml of phage buffer by end over end rotation for
2 hrs at room temperature, and the phage suspension was diluted and used for a
second infection as described above. After overnight incubation at 37°C, a single
plaque was isolated and used as a stock for all subsequent manipulations.
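
When plaques from the 10-fold serial dilutions above are counted, the titer of the
undiluted phage suspension follows from the plaque count, the dilution and the
10 µl volume used for infection. The Python sketch below is illustrative only: the
function name and the example count are assumptions, and the text itself does not
describe a titer calculation.

```python
def titer_pfu_per_ml(plaques: int, tenfold_dilutions: int,
                     volume_plated_ml: float = 0.01) -> float:
    """Estimate pfu/ml of the undiluted stock from one countable plate.

    `tenfold_dilutions` is the number of successive 10-fold dilution steps;
    the default volume is the 10 µl (0.01 ml) used to infect the cells.
    """
    if plaques < 0 or tenfold_dilutions < 0:
        raise ValueError("counts must not be negative")
    if volume_plated_ml <= 0:
        raise ValueError("plated volume must be positive")
    return plaques * 10 ** tenfold_dilutions / volume_plated_ml


if __name__ == "__main__":
    # Hypothetical plate: 42 plaques from the fifth 10-fold dilution.
    print(f"{titer_pfu_per_ml(42, 5):.2e} pfu/ml")
```

For the hypothetical plate shown, the estimate is about 4.2 x 10^8 pfu/ml.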

The propagation procedure for bacteriophage Dp-1 was modified from the
agar layer method of Swanstrom and Adams (1951). Briefly, the R6 strain of
Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in
K-CAT. The culture was then diluted 20 fold in K-CAT and incubated at 37°C until the
OD540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque
forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min
at 37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) were added to the mixture,
which was poured onto the surface of 150 mm K-CAT agar plates and incubated for
16 hrs at 37°C. After solidification of the soft agar layer, 7.5 ml of melted soft
agar were added to each plate. To collect the plate lysate, 20 ml of K-CAT media
were added to each plate and the soft agar layers were collected by scraping them
off with a clean microscope slide, followed by vigorous shaking of the agar
suspension for 5 min to break up the agar. The mixture was then centrifuged for
10 min at 4,000 rpm (2,830 x g) using a JA-10 rotor (Beckman) and the supernatant
(lysate) was collected and treated with 10 µg/ml of DNase I and RNase A for 30 min
at 37°C. To precipitate the phage particles, the phage suspension was adjusted to
10% (w/v) of PEG 8000 and 0.5 M of NaCl, followed by incubation at 4°C for 16 hrs.
The phage was recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C
on a GS-6R table top centrifuge (Beckman). The pellet was resuspended in 2 ml of
phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM MgCl2). The phage
suspension was extracted with 1 volume of chloroform and further purified by
centrifugation on a cesium chloride step gradient as described in Sambrook et al.
(1989), using a TLS 55 rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at
28,000 rpm (67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again
on an isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for
24 hrs at 4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for
4 hrs at room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl,
50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).

Example 16. DNA sequencing of the Bacteriophage Dp-1 genome. 

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec cooling in ice/water for
3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM
Tris [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA
(50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New
England Biolabs) and was incubated overnight at 16°C. Transformation and selection
of bacterial clones containing recombinant plasmids was performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism BigDye™ primer cycle sequencing (21M13 primer: #403055)(M13REV
primer: #403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially
met, a sequencing primer was selected and phage DNA was used directly as sequencing
template employing the ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.

Example 17. Bioinformatic management of primary nucleotide sequence. 

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge
of the contigs. Phage DNA was used directly as sequencing template employing the ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in
Table 28.

A software program was used on the assembled sequence of bacteriophage
Dp-1 to identify all putative ORFs of at least 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an appropriate start codon is encountered, a counting
mechanism is employed to count the number of codons (groups of three nucleotides)
between this start codon and the next stop codon downstream of it. If a threshold
value of 33 is reached, or exceeded, then the sequence encompassed by these two
codons is defined as an ORF. This procedure is repeated, each time starting at the
next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30, and Fig. 6.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for
sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);
iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);
v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);
vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);
vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);
viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);
ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage
Dp-1 are shown in Table 31.

Example 18. Sub-Cloning of Bacteriophage Dp-1 ORFs. 
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible.
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and
Garcia, 1990). This plasmid can be modified by conventional techniques to insert the
inducible arsenite promoter, derived from the shuttle vector pT0021, in which the
firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a
S. aureus plasmid (Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997).
Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite.
Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain
the ars promoter, arsR gene and a cloning site for introduction of individual phage
ORFs downstream from a Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the
arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter
activity is dependent on the proteins NisR and NisK, which constitute a two-component
signal transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus
agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis.
Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained
in all of these species. A vector containing this promoter was published by
Eichenbaum Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR
(1998) Appl Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, which
allow replication and transcription in Streptococcus can also be utilized.

Alternatively, a constitutive promoter can be used to drive expression
of the introduced ORF, and cell growth is compared to that of control bacterial cells
containing the parental vector lacking any introduced phage ORF. Recombinant
plasmids are introduced into S. pneumoniae R6 as previously described (Diaz and
Garcia, 1990).

Cloning of ORFs with a Shine-Dalgarno sequence

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding
the stop codon) and is preceded by a different restriction site. The PCR product of
each ORF will be gel purified and digested with the restriction enzymes whose sites
are contained on the PCR oligonucleotides. The digested PCR product is then gel
purified using the Qiagen kit, ligated into the modified shuttle vector, and used to
transform bacterial strain DH10β. Recombinant clones are then picked and their insert
sizes confirmed by PCR analysis using primers flanking the cloning site as well as by
restriction digestion. The sequence fidelity of cloned ORFs will be verified by DNA
sequencing using the same primers as used for PCR. In cases where verification of an
ORF cannot be achieved by one pass of sequencing using primers flanking the cloning
site, internal primers will be selected and used for sequencing. Recombinant plasmids
will be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia,
1990).
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates 

The functional identification of killer ORFs can be performed by spreading an
aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar
plates containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates are incubated overnight at 37°C, after which the growth of the
ORF transformants on plates that contain arsenite is compared to that on plates
without arsenite.

2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid-log phase (OD540 = 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, G.R., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193: 1033-1036) and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, M.J., Gaeng, S., Wendlinger, G.,
Maier, S.K. and Scherer, S. 1998. FEMS Microbiology Letters 162: 265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
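
The readout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the argument names, and the 0.5 cutoff used to call a colony count "lower" are hypothetical and do not come from the specification, which gives no numerical threshold.

```python
def relative_growth(od_start, od_induced, od_uninduced):
    """Growth of the induced culture as a fraction of the uninduced control.

    od_start is the optical density when inducer is added; the other two
    values are read after the additional incubation period.
    """
    control_gain = od_uninduced - od_start
    if control_gain <= 0:
        raise ValueError("uninduced control did not grow")
    return (od_induced - od_start) / control_gain


def classify_orf(colonies_induced, colonies_uninduced, threshold=0.5):
    """Classify an ORF from colony counts on inducer-free plates.

    No colonies after induction -> bactericidal.
    Fewer, but detectable, colonies -> bacteriostatic.
    `threshold` is an assumed cutoff for "fewer"; it is not from the source.
    """
    if colonies_uninduced <= 0:
        raise ValueError("uninduced control yielded no colonies")
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_induced / colonies_uninduced < threshold:
        return "bacteriostatic"
    return "no inhibition"
```

For example, a clone that starts at an OD of 0.2 and reaches 0.25 with inducer while the uninduced control reaches 0.6 has grown to one eighth of the control, and a plate with no colonies from the induced culture would be scored as bactericidal.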
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All patents and publications mentioned in the specification are indicative of 

the levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is
well adapted to carry out the objects and obtain the ends and advantages mentioned,
as well as those inherent therein. The specific methods and compositions described
herein as presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other uses
will occur to those skilled in the art, which are encompassed within the spirit of the
invention as defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions
and modifications may be made to the invention disclosed herein without departing
from the scope and spirit of the invention. For example, those skilled in the art will
recognize that the invention may suitably be practiced using a variety of different
bacteria, bacteriophage, and sequencing methods within the general descriptions
provided.

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not specifically
disclosed herein. Thus, for example, in each instance herein any of the terms
"comprising," "consisting essentially of" and "consisting of" may be replaced with
either of the other two terms. The terms and expressions which have been employed
are used as terms of description and not of limitation, and there is no intention in
the use of such terms and expressions of excluding any equivalents of the features
shown and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed. Thus, it should
be understood that although the present invention has been specifically disclosed by
preferred embodiments and optional features, modification and variation of the
concepts herein disclosed may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of this invention as
defined by the appended claims.
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In addition, where features or aspects of the invention are described in terms of 
Markush groups or other grouping of alternatives, those skilled in the art will 
recognize that the invention is also thereby described in terms of any individual 
member or subgroup of members of the Markush group or other group. For example, 
if there are alternatives A, B, and C, all of the following possibilities are included: A
separately, B separately, C separately, A and B, A and C, B and C, and A and B and
C. Thus, for example, for the bacteria and phage specified herein, the embodiments
expressly include any subset or subgroup of those bacteria and/or phage. While each
such subset or subgroup could be listed separately, for the sake of brevity, such a
listing is replaced by the present description.
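
The enumeration of alternatives A, B, and C given above corresponds to listing every non-empty subset of the group. A minimal sketch in Python follows; the function name markush_subsets is hypothetical and is used only for illustration.

```python
from itertools import combinations


def markush_subsets(members):
    """Return every non-empty subset of `members`, smallest subsets first."""
    return [combo
            for size in range(1, len(members) + 1)
            for combo in combinations(members, size)]


subsets = markush_subsets(["A", "B", "C"])
for s in subsets:
    print(" and ".join(s))
```

For three alternatives this yields the seven possibilities listed in the text; a group of n members yields 2^n - 1 subsets, which is why a separate listing is replaced by the general description.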

Thus, additional embodiments are within the scope of the invention and within 
the following claims. 
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Table 1 



Phages against human and animal pathogenic bacteria


Pathogen name


Phage name


Catalog no.


Origin/reference


Acinetobacter
calcoaceticus


A3/2 
A10/45
A36 
B9GP 

15 "DD 

BS46 
E13 

E14 

531 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




Ap3 
P78 




J. Bacteriol. 1984. 157: 179-183

J. Gen. Microbiol 1986.132: 2633-2636 


Acinetobacter 
haemolyticus 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Acinetobacter 
johnsonii 






Felix d'Herelle Reference 
Centre,Quebec,Quebec


Acinetobacter sp. 


BP1 




J.Virol.1968.2:716-722




G4,HP2,HP3& 
HP4 




Can.J.Microbiol.1966.12:1023-1030 &

J.Virol.1974.13:46-52 &

Arch.Virol.1994.135:345-354




A1,A4, A9& 
196 




Arch.Virol.1994.135:345-354




HP1 




Can.J.Microbiol. 1966. 12: 1023-1030 




A 19, A23,A29, 
A31, A33,A34, 
A3759 & 2845 




J.Microsc (Paris) 1973.16:215-224 & 
CR.Hebdo Seances Acad.Sci.Ser D.Sci 
Natur(Paris)278:1907-1909 & 
Arch.Virol.1994.135:345-354 &
Rev.Can.Biol. 1970.29:317-320


Actinobacillus 
actinomycetemcomitans






FEMS Microbiol Lett 1994.119:329-337
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/ 









miec, irnmun. lyo^. Jj: o43-i49 








Mol Gen Genet 1998 258- 323-325 




Aa(p247 




Oral Microbiol. Immunol. 1997.12: 40-46


Actinomyces viscosus 




43146-B1 


The American Type Culture Collection 








Infect.Immun.1985.48:228-233








Infect.Immun.1988.56:54-59








Plasmid 1997.37:141-153 


Aeromonas hydrophila 


PM2** &PM3 




FEMS Microbiol.Lett. 1990.57:277-282




Aeh1




Felix d'Herelle Reference 




Aeh2 




Centre,Quebec,Quebec 




PM4 








PM5 








PM6 








T7-ah 
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Aeromonas 
salmonicida 


3 

25 
29 

i\ 
j i 

J*. 

4 tUI\_i\.2.8 i 

43 
51 
56 
59.1 
65 

Asp37 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




55R.1 




Can. J. Microbiol. 1983. 29: 1458-1461 


Alteromonas espejiana 


PM2** 


27025-B1 


The American Type Culture Collection 


Asticcacaulis
biprosthecum 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Asticcacaulis 
excentricus 


<(>Ac21 
4>Ac24 


15261-B1 
15261-B2 
15261-B3 


The American Type Culture Collection 


Azotobacter vinelandii 


A14 
A21 
A31 
A41 
PAVl 


12518-B1

12518-B4 

12518-B5 

12518-B9 

12518-B10

13705-B1


The American Type Culture Collection 


Azotobacter sp. 






Virology 1972.49:439-452 


Bacteroides fragilis 


Bf-i 




Rev. Infect. Dis. 1979. 1: 325-336 




B40-8 




FEMS Microbiol. Lett. 1991. 66: 61-67 




HSP40 




Appl. Environ. Microbiol. 1989. 55: 2696- 
2701 




phiAl 




Zentralbl.Bakteriol. 1972.222:57-63


Bdellovibrio 
bacteriovorus 


MAC-1 




J. Gen. Microbiol. 1987. 133: 3065-3070 


Bdellovibrio sp. 


VL-1 




J.Virol.1973.12:1522-1533


Bordetella 
bronchiseptica


214 




Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13
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Bordetella 
parapertussis 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 








Mol. Gen. Mikrobiol. Virusol. 1988.4: 22-25 








Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13




41405 




Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13


Brucella abortus 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 






23448-B1 
7144R-R2 

23448-B3 
17385-B1 
17385-B2 


The American Type Culture Collection 




10/1 

24/11 

212/XV 








BK-2.TB & 
Fi** 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1983.2:48-52




R/c&R/O 




Dev. Biol. Stand. 1984.56: 55-62


Brucella canis 


R/c 




Dev. Biol. Stand 1984.56: 55-62 


Brucella melitensis 


BK-2 


23456-B1 


The American Type Culture Collection 


Brucella suis 


Wb 




Zentralbl.Veterinarmed.l975.22:866-867 
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Fi** & TB 




Zh.Mikrobiol.Epidenuol.Immunobiol. 1 983.2: 








48-52 


Brucella sp. 






Can. J. Vet. Res. 1989.53: 319-325 








Res. Vet. Sci. 1988. 44: 45-49




R 




Zh.Mikrobiol.Epidemiol.Immunobiol.1983.2:








A O 


Campylobacter coli




43133-B1 


The American Type Culture Collection 






43134-B1 




Campylobacter coli 


1 o 

18 


43135-B1 


• 

The American Type Culture Collection 


(Cont'd) 


19 


43136-B1 






1 A 






Campylobacter jejuni 


1 


*5 cm O D1 


The American Type Culture Collection


2 


35919-B1 






3 


35920-B1 






4 


35921-B1 






5 


15Q18-B2 






6 


^920-R2 






7 


^Q22-B2 






8 


35923-B1 






9 


35924-B1 






1 A 
10 


35925-B1 






11 


35925-B2 






12 








13 


35924-B2 






If 


35922-B3 






1 *7 


43133-B1 






18 


43134-B1 






19 


43135-B1 






20 


43136-B1 




Campylobacter 


HP1 




J. Med. Microbiol.1993. 38: 245-249 


(Helicobacter) pylori 








Chlamydia psittaci 


Choi** 




J. Gen. Virol. 1989. 70: 3381-3390 


Clostridium 


CAK-1 




J.Bacteriol. 1993.175:3838-3843 


acetobutylicum 
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Clostridium botulinum 






Nucleic Acids Res. 1990.18:1291








Bioch.Biophys.res.Commun.1990.171.1304- 
1311 








Microbiol.Immunol. 1981.25:915-927








J.Vet.Med.Sci. 1992.54:675-684




CEβ & CEγ






Clostridium difficile 


41 &56 




J. Clini.Microbiol. 1985.21:251-254 



WO 00/32825 



PCT/IB99/02040 






Clostridium 






Rev.Can.Biol. 1977.36:205-215


perfringens














FEMS Microbiol.Lett. 1990.54:323-326 


Clostridium 




8074-B1 


The American Type Culture Collection 


sporogenes 


59 


17886-B1 






70 


17886-B3 






71 


17886-B4 






72S 


17886-B5 






72L 


17886-B6 




Clostridium tetani 


A & B 




Rev.Can.Biol. 1978.37:43-46 


Corynebacterium 






Vopr.Virusol. 1986.31:577-584


diphtheriae








Corynebacterium 


NN 


12319-B1 


The American Type Culture Collection 


pseudotuberculosis 








Corynebacterium sp 


DLC 2921/49 


12052-B1 


The American Type Culture Collection 
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Enterococcus faecalis 


42 


19948-B1 


The American Type Culture Collection 


Enterococcus faecium 


124 

133 


19950-B1 
19953-b2 
19953-B1 


The American Type Culture Collection 
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Escherichia coli 




11303-B14 
11303-B10 
11303-B21 
8677-B1 

i 1 mi Ti 1 1 

13706-B4 


The American Type Culture Collection 


Escherichia coli 




15766-Bl 


The American Type Culture Collection 


(Cont'd)




15766- Bl 
1242-B5 
15669-B2 

15767- Bl 
11303-B16 
27-65-Bl 






C204 

El 

fl** 

f2** 

FCZ 

fd** 


25065-B2 
15669-B1 






15597-B1 






21816-B1 






23724-B9 

15593-B1 

25404-B1 

29746-Bl 

23631-Bl 

25868-Bl 

25298-Bl 

25298-B2 

11303-B37 

11303-B24 






Ifl** 


1 1 "W?-R2fi 

1 1 JUJ'O^U 






11303-B28 
11303-B29 

11303-B31 
11303-B25 
1 1303-B35 






MS2** 

MU9 

Mu-1 

0x6 

PI** 

P4 Sid; ** 

Q-P** 

R17** 

Zl¥J\ 

ZJ/2 


11303-B34 






11303-B36 
11303-B32 
13706-B5 






11303-B1 






11303-B2 

11303-B3 

11303-B4 

35060-Bl 

35060-B2 

35060-B3 

11303-B5 

11303-B6 

11303-B7 

11303-B38 

12141-Bl 
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Escherichia coli 
(Cont'd) 



547 

UV1 

UV47 

UV375 

cc3** 

X ** 

A.C-17 

X sus P-3 

X sus R-5 

X sus J-6 

X sus 0-8 

X sus A-ll 

X ind" 

^92 

^R 

<z*V-l 

^K174** 

<z$Xcs70am-3 


11303-B20 

11303-B17 

11303-B15 

11303-B11 

11303-B18 

13706-B2 

23724-B2 

23724-B1 

23724-B3 

23724-B4 

23724-B5 

23724-B6 

23724-B7 

23724-B8 

35860-B1 

13706-B3 

15597-B2 

13706-B1 

49696-B1 


The American Type Culture Collection 






G4** & <*K** 




Biochim.Biophysica Acta. 1992.1130:277-288


BF23** 




J.Bacteriol. 1977.129:265-275


Mul 




J.UltrastructRes. 1966. 14:441-448 


Hpl7 




J.Mol.Biol. 1991.218:705-721


K3** & 0x2** 




FEBS Lett.1987.215:145-150


Rbl8**,Rb51& 
Rb69** 




J.Bacteriol. 1990.172:180-186


H1**,H3,H8, 
KIR A: Oxl 




Mol.Gen.Genet. 1990.221:491-494 


Ml** TnTa** A: 

iVl 1 ,1 lild Ob 

Tulb** 




J Mol Biol 1987 196*165-174 


K10 




J.Bacteriol. 1979.140:680-686


Qsr' 




J.Bacteriol.1985.162:256-262


B278 




J.Gen.Microbiol.1988.134:1333-1338


phi 80** 




FEMS Microbiol.Lett. 1994.119:71-76


r>hi m 1 7^ 

\Jlll ill 1 # «/ 




Genetika 1985.21:673-675 


tf-1 




J.Gen.Microbiol. 1987.133:953-960


P4 & phiR73 




Mol.Microbiol.1995.18:201-208


1,-2 




J.Gen.Microbiol. 1982.128:2797-2804


PRD1 




Virology 1990.177:445-451 


K3hx 




Mol.Gen.Genet. 1987.206:110-115


933J**& 
933W** 




Infect.Immunity.1986.53:135-140


H19-B** 




J.Bacteriol. 1987.169:4308-4312


Tcp-lll 




Zentralbl.Bakteriol.Mikrobiol.Hyg.1988.270:41-51
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Escherichia coli 
(Cont'd) 



N4** 




Vet.Microbiol. 1992.30:203-212


Phi 80 trp 




Ann.Inst.Pasteur.1971. 120:121-125 


Obeta 1 




J.Bacteriol. 1978.133:172-177


P1CM 




J.Gen.Microbiol.1978.107:73-83


PA-2** 




J.Bacteriol. 1990.172:1660-1662


186** 




Mol.Gen.Genet. 1982.187:87-95


186.IX.B 




Mol.Microbiol.1992.6:2629-2642


21** 




Virology 1983.129:484-489


P4** 




Microbiol.Rev.1993.57:683-702


82** 




J.Biol.Chem.1987.262:11721-11725


PSP3 




J.Bacteriol. 1996.178:5668-5675


HK022** 




Nucleic Acids Res. 1994.22:354-356 


D108** 




Nucleic Acids Res. 1986.14:3813-3825


Rb49 




J.Mol.Biol. 1997.267:237-249


Ike** 




J.Mol.Biol.1985.181:27-39


P22dis 




Mol.Gen.Genet. 1978.166:233-243


N15** 




J.Bacteriol. 1996. 178: 1484-1486 


Ifl** 




Proc.R.Soc.Lond.B.Biol.Sci. 1991.245:23-30


Stx2Phi-I & 




Infect.Immun.1998.66:4100-4107


Stx2Phi-II 






18 




Virology 1987.156:122-126 


X 




J.Gen.Microbiol. 1981.126:389-396


AC3 




Mol.Microbiol. 1991.5:715-725
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BW-1 




Felix d'Herelle Reference 




C-l 




Centre,Quebec,Quebec 




E9202 








Esc-7-1 1 








H1QJ 








Haiti 
nam 








































Y T , 








M 








Mil** 








O103 








0157:H7 








P1D 








ptl 








PilHa 








ri\\rtr o 








PR772 








SS4 








B40 








A,vir" 








Q8 








09-1 








92 






Haemophilus 


HP1** 




Nucleic Acids Res. 1996.24:2360-2368 


influenzae 


CO** 




Gene 1997 196: 139-144 


Halobacterium 


S45 




Felix d'Herelle Reference 


cutirubrum 






Centre,Quebec,Quebec 


Halobacterium 






Felix d'Herelle Reference 


halobium 






Centre,Quebec,Quebec 








Can.J.Microbipl. 1 982.25.!* lo-92 1 


Halobacterium 






Biol.Chem.Hoppe Seyler 1994.3J5T747-757 


salinarium 
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Klebsiella oxytoca 


tf-1 




J.Gen.Microbiol. 1987.133:953-960


Klebsiella pneumoniae 


60 
92 


23356-B1

23357-B1


The American Type Culture Collection 




K19Q 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




FC3-1 & FC3-9 




Can.J.Microbiol. 1991.37:270-275




FC3-10 




FEMS Microbiol.Lett. 1991.67:291-297


Klebsiella sp. 


Kll* 




Mol Gen Genet 1990 221:283-286


Leptospira sp. 


Lbl, Lbi ot Lii*t 




Res Microbiol 1990 141:1131-1138


Listeria 


243 


23074-B1 


The American Type Culture Collection 


monocytogenes 


197,1313 & 
9425 




Appl.Environ.Microbiol. 1997.63:3374-3377




H387 & H387-A 




Appl.Environ.Microbiol. 1993.59:2914-2917 




5775,6223 
&12682 




APMIS.1993.101:160-167 




2389, 2671, 
4211 & 2685 




Intervirology 1994.37:31-35 & 
Zentralbl.Bakteriol.Mikrobiol.Hyg. 1986.261:12-28




4b.4ab,4g & 3c 




Ann.Microbiol (Paris) 1977.128:185-198 




All 8, A500& 
A511** 




MoLMicrobiol. 1995.16:1231-1241-992 




1,3,4, 5, 6,7, 8, 
9, 10,11,14,15, 
16. 17, 19&20 




Ann.Microbiol (Paris) 1979.130B: 179-189 




l/2a, l/2b, 3c, 
4ab. 6a & 6b 




Clin.Invest.Med. 1984.7:229-232




4>LMUP35 
2685 




Felix d'Herelle Reference 
Centre,Quebec,Quebec


Listeria innocua 


4211 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Micrococcus luteus 


N3 
N4 
N8 


4698-B1 
4698-B4 
4698-2 
4698-B3 


The American Type Culture Collection 


Micrococcus luteus 


NI7 




Can.J.Microbiol. 1979.25:1027-1035


Mycobacterium 
smegmatis 


BK-3 
Bol** 
Bo 6 
Bo 611 

Bo 6in 
Mc-2 
Mc-4 
NN 

Phagus lacticola 
Rl 


27203- Bl 

27204- Bl 

27205- Bl 
27205-B2 
27205-B3 
607-B6 
607-B7 
11727-Bl 
11759-Bl 
607-Bl 


The American Type Culture Collection 
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HER 3 17 
HER 330 
HER 333 

UT?T> TIC 

HER 335 
HER 334 
HER 331 
HER 316 


Felix d'Herelle Reference
Centre,Quebec,Quebec 




Legendre 
Leo 
Roy 
Sedge 


> 










Mol.Microbiol. 1993.7:395-405 








J.Mol.Biol.l998.279:143-164 








Proc.Natl.Acad.Sci.USA.1988.84:2833-2837








Mol.Biol.Rep. 1981.30:11-15 








Proc.Natl.Acad.Sci.USA 1997.94:10961- 
10966 




2S/M, JIM, IZZ, 
154, 37, 29D, 46, 
139,110, 141, 
74D, AG1& 
DS6A 




Arch.Virol.1993.133:39-49 &
Am.Rev.Respir.Dis. 1975.112:17-22


Mycobacterium 
fortuitum 


Bo 4 
Bo 7 


23052-Bl 
27207-Bl 
27207-B2 


The American Type Culture Collection 
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Mycobacterium leprae 






Ann.Microbiol. (Paris) 1982.133:93-97 


Mycobacterium 
tuberculosis 


DS6A 


25618-Bl 
25618-B2 
4243-Bl 


The American Type Culture Collection 




110, 139&33D 




Arch.Virol.1993.133:39-49




AG1,GS4E, 
BG1, 

PH&BKl 




The Biology of Mycobacteria.Academic 
Press/Toronto 1982 (Ratledge & Stanford) 
1982.309-351 


Mycobacterium sp 


Phagus pellegrini 
NN 

Bi 


11760- Bl 

11761- B1 
23239-B1 


The American Type Culture Collection
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TM4, ph60, 
pn/2, 

t>U A "CIO 

ohAE40 
&Bxbl 




Microbiology 1995.141:1173-1181 




C2 




Experientia 1969.25:1112-1113




18&I15 




J.Gen.Virol.1987.68:949-956




£1 
Oj 




UxUZllCa iyoo.jo.oi /-ozz 




phlei & 
butyricum 




J.Gen. Virol. 1975.29:235-238 




MvF3P-59a 




Z.Allg.Mikrobiol. 1968.8:29-37




Bo2a 




J.Gen.Virol.1973.20:75-87




D4.D28 & D32 




J.Exptl.Med.1966.123:327-340




HC 




J.Bacteriol. 1963.86:608-609 


Mycobacterium 
vaccae 


B5 


15483-B1 


The American Type Culture Collection 


Mycobacterium phlei 


NN 
Bo 2 

DO JUl 
JDU J 


H728-B1 
11758-B1 

27086-B1 


The American Type Culture Collection 


Mycoplasma 
arthritidis 


MAV1** 




Infect.Immunity. 1995.63:4016-4023


Mycoplasma hyorhinis 


Hr-1 




Arch.Virol.1983.77:81-85


Mycoplasma 
pneumoniae 


Br-1 




Arch.Virol.1983.75:1-15


Mycoplasma pulmonis 






Plasmid 1995. 33: 41-49 


Mycoplasma sp. 






J.Gen.Microbiol. 1985.131:3117-3126








J. Virol. 1986.59:584-590 








Gene 1994. 141: 1-8 
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Microbios 1990. 64: 111-125 






Infection& Immunity 1995. 63: 4016-4023 






Med.Biol.1982.60:116-120


MV-L2& 




Arch.Virol.1979.61:289-296






Acta.Virol.1978.22:443-450






J.Gen.Virol. 1979.42:315-322






Virology 1973.55:118-126 
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Science 1971.173:725-727 


Neisseria perflava 






J.Clin.Microbiol.1976. 4:87-91 


Nocardia erythropolis


ipC 




J.Gen.Virol. 1974.23:247-254


cpEC 




J.Bacteriol. 1976.126:1104-1107


Pasteurella multocida 






Arch.Exp.Veterinarmed. 1981.35:433-436


B939a 




Am.J.Vet.Res.1978.39:1565-1566


Nos.115,32, 967 
& 

1075 




Vet.Med.Nauki. 1977.14:33-36 


Propionibacterium 
acnes 


NN 


29399-B1 


The American Type Culture Collection
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Pseudomonas 
aeruginosa 





12175-B1 


The American Type Culture Collection


{. 


12175-B2 






1X1/ J UJ 




2B 


12175-B4 




1 1 
1 1 


14205-B1 




1 u 


14206-B1 




&<\ 


14207-B1 






14208-B1 




44 


14209-B1 




ID 


14210-B1 






14211-B1 






14212-B1 




1 1 J 


14213-B1 






14214-B1 






15692-B1 




HofF2 


14203-B1 




Hoff3 


14204-B1 




Pa 


12055-B1 




Pb 


12055-B2 




PB-1 


15692-B3 


r 


Pc 


12055-B3 




Pf 


25102-B1 




PP7** 


15692-B2 








Felix d'Herelle Reference 






Centre,Quebec,Quebec 


7&31 






Pf3** 




J.Virol.1983.47:221-223


cp-MC 




Can.J.Microbiol.1969.15:1179-1186


pn** 




J.Mol.Biol. 1991.218:349-364


PR4** 




J.Gen.Virol.1979.43:583-592


A7 




J.Bacteriol.1992.174:2407-2411


KF1 




J.Biochem.1983.93:61-71


<zCTX** 




Mol.Microbiol. 1993.4:1703-1709


f2+* 




J.Virol.1977.24:135-141
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dd<. 



cpKZ,21,(pNZ, 
PMN17, PTB80, 
68, PB-1,E79, 
16, 

109,352,1214, 
F8,71,337, M4, 
(pC17, SL2,B17, 
Li-24, <pmnP78, 
PS17**,<pl,73, 
M6, Li-2, 7, 
(pmnF82, 
PTB2, PTB20, 
PTB42, <pKF77, 
31.PTB21, 
H9x, 

<pPLS27, B3, 
258, 

Hwl2,PM57, 
PM62,PM105, 
148,PM681, 
198, 

218, 222, 242, 
246, 

PC131,(pCll, 
SL5, 

D3112**,Jbl9, 
F7, 

PM69,PM13, 
PM61,PM113, 
q>240, 249 & 269 
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Pseudomonas 


297, 309,318, 




Arch.Virol.l993.131:141-151 


aeruginosa 








(Cont'd) 
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Pseudomonas cepacia 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Pseudomonas fragi 


wy 


27362-Bl 
27363 Bl 


The American Type Culture Collection 


Pseudomonas 
phaseolicola 


§6 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Pseudomonas putida 


Sh-l 


12633-B1 


The American Type Culture Collection 


Pseudomonas syringae 




40492-B1 
21781-B1 


The American Type Culture Collection 


Pseudomonas sp. 


PPs-G3 


49780-B1 


The American Type Culture Collection 


Salmonella bareilly 


Sab 2 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella enteritidis 


1.2.3&6 




Epidemiol.Infect. 1995.114:227-236


2a, 3a, 4a, 5a, 6a, 
7a, 8a, 9a, 15, 
19, 20 &21** 




Vet.Med.Nauki. 1975.12:55-60


Salmonella newington 


Epsilon 34




J Struct Biol 1995 115:283-289


Salmonella newport 


16-19 


27869-B1 
27869-B2 


The American Type Culture Collection 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella paratyphi 


Paratyphoid A 


19940-B1 
12176-B1 


The American Type Culture Collection 


Jersey 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella 
senftenberg 


SasLl, SaL2, Sal 

i 
j> 

SaL4, SaL5 & 
SasL6 




Indian J.Med.Res. 1997.105:47-52 


Salmonella 
typhimurium 


P22** 
SL-1 


19585-B1 
40282 


The American Type Culture Collection 


MB78** 




J.Virol. 1982.41:1038-1043


SE1 




J.Gen.Microbiol.1986.132:1035-1041


LT2 




Virology 1971.45:835-636 


ES18** 




Virology 1970.42:621-632 


L** 




J.Virol.1985.56:1034-1036
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P1CM cir-100 




Mol.Gen.Genet. 1975.138:113-126




F22 




Genet.Res. 1986.48:139-143




Feb 1 




J.Gen.Virol.1978.38:263-272




Fels2 




Genet.Res.1986.48:139-143




Pv 




ivioi.vjcii.oenet. iy /u.iuo. lo^-zuz 




r 1KC 




viroiogy iy /4.0U.DUJ-D 14 








T T}«ar>tf»T-tr»1 1QQ7 |<Q. 1 AA"! 1 AAA 

j.oacienoi. iyo/.ioy. iuuj-iuuy 




TJX 

rii 




rionot Dor 107£77*71C T>"> 


LJLitirn/fiam 


fRA 






typhimurium 


MnHI 

1V1UU1 




MaI n#»n dmrt IQRfi 7A7-"577 lift 


(Cont'd) 


P77 ft*irA 1 /»ir^ 

rZ/ {cir^-i, C1TJ- 




Moi.uen.uenet. iyo4.iyo:lUj-lOy 




1 &cir6-l) 








BF23** 




Mol.Gen.Genet.1976.147:195-202




Kbl 




J.Bacteriol.1974.117:907-908








J Gen Viml 107R 41 *^fi7-17A 

J.VJCU. V 11 Ul. 17/ O.H 1 .JU /—J / o 




pp r» i * * 




viroiogy iyyu.1 //:4*o-4oi 




T 0** 




J.Gen.Microbiol.1982.128:2797-2804




tf-1 




J.Gen.Microbiol. 1987.133:953-960




X** 




J.Gen.Microbiol. 1981.126:389-396


Salmonella 


o 
5 


1993 /-rJl 


The American Type Culture Collection 


tvnhn^/i/tvnhi 


23 


19938-B1 




25 


19939-Bl 






46 


19942-Bl 






53 


19943-Bl 






163 


I9946-B1 






175 


I yy*\ /•Di 






Vil 








ViVI 


27870-R2 






01 




Felix d'Herelle Reference








Centre,Quebec,Quebec 




Vill 




Chung Hua Liu Hsing Ping 








HTP 1009 11'78S 








j.uen.Microoioi, 1 yo3. izy :33yo-334UU 


V/r/m/lMi>///i on 
iJCilfilUfldlU iy. 


PI 




The American Type Culture Collection 




P4+* 


7SQS7-R7 






P9a 


25957-B3 






P9c 


25957-B4 






P10 


25957-B5 






102 


19945-B1 






Chi(x) 


9842-B1 






R34 


97541 






MG40 




Virology 1968.34:521-530 




P14 




Microb.Pathog. 1990.8:393-402 




PSP3 




Virology 1992.188:414 _ - 




Ike** 




Zentralbl.Bakteriol. 1976.234:294-304




P27&9NA 




J.Virol.1986.12:921-931


Sphaerotilus natans 


SN1 




Appl.Environ.Microbiol. 1979.37:1025-1030
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Shigella dysenteriae 




23351-B1 


The American Type Culture Collection 






11456b 










11456a-Bl 






— : n : 

Shigella flexneri




12661-B1 


The American Type Culture Collection




Mil 




Mol.Microbiol. 1997.26:939-950




ol v 




Gene 1997.22:217-227 








Mol.Microbiol.1995.18:201-208




SfX 




Gene 1993.129:99-101 


Shigella sonnei


C16** 








Ufa 




Mol.Biol (Mosk) 1977.11:323-331


Shigella sp 


37 


23354-B1 


The American Type Culture Collection 


Spiroplasma citri 


SdVI 




Plasmid 1993.29:193-205 


Spiroplasma sp. 


SpVl-R8A2B 




Nucleic Acids Res. 1990.18:1293 


SdV3 




Isr.J.Med.Sci.1987.23:429-433




SdV4 




J.Bacteriol.1987.169:4950-4961


Staphylococcus albus 






Staphylococci & Staphylococcal 








Infections. 1997. 










Voll:503-508 (Karger,Basel) 
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Staphylococcus aureus 





27702-B1 




27 /U3-B1 




nnAyl D 1 

2/ /04-Dl 




233oO-Bl 




23361-B1 


1 J 


"»*7*7AC n 1 

27705-B 1 


1 "7 


W 11 t)1 

27712-Bl 




27690-B 1 


42U 


2769 1-B1 


42C 


27692-B1 


4/ 


27693-B1 


32 


27694-B1 


32A 


27695-B1 


CI 

53 


27696-B1 


34 


27697-B1 


33 


27698-B1 


/I 


27699-B1 


/3 


27693-B2 


T7 
/ / 


1T7AA O 1 

27700-B 1 




WA1 TD 1 

27701-B1 


oil 


27706-B 1 


0 t 


27707-B1 


83A 


27708-B1 


o4 


33742 


85** 


33741-B1 


88 


15565 


92 


19685-B1 


5504* 


11987-B1 


K 


11988-B1 


PI 


15752-B1 


P14 




UC18 





The American Type Culture Collection 
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HER 101 
HER 239 
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Table 2 



Bacteriophage 77, complete genome sequence, 41708 nucleotides

1 gatcaaaata cttggggaac ggttagggag taaacttcgc gataatttta aaaattcatg 

61 tataaccccc ctcttataac cattttaagg caggtgatga aatggagatt atagtcgatg 

121 aaaatttagt gcttaaagaa aaagaaaggc tacaagtatt atataaagac atacctagca 

181 ataaattaaa agtagttgat ggtttaatta ttcaagcagc aaggctacgt gtaatgcttg 

241 attacatgtg ggaagacata aaagaaaaag gtgattatga tttatttact caatctgaaa 

301 aggcgccacc atatgaaagg gaaagaccag tagccaaact atttaatgct agagatgctg 

361 catatcaaaa aataatcaaa caattatcgg atttattgcc cgaagagaaa gaagacacag 

421 aaacgccatc tgatgattac ctatgattag taataaatac gttgatgaat atataaattt 

481 gtggaaacaa ggaaagataa ttttaaataa agaaagaatt gatctcttta attatctaca 

541 aaaacatata tattcacgag atgatgtata ttttgatgaa cagaaaatcg aggattgtat 

601 caaatttatt gaaaaatggt attttccaac attaccattt caaaggttta tcatagctaa 

661 tatatttctt atagataaaa atacagatga agctttcttt acagaatttg ctattttcat 

721 gggacgtgga ggcgggaaaa acggtctaat aagtgctatt agtgattttc tttctacgcc 

781 cttacacgga gttaaagaat atcacatctc cattgttgct aatagtgaag atcaagcaaa 

841 aacatcgttt gatgaaatca gaaccgtttt aatggataac aaacgaaata agacgggtaa 

901 aacgccaaaa gctccttatg aagttagtaa agcaaaaata ataaaccgtg caactaaatc 

961 ggttattcga tataacacat caaacacaaa aaccaaagac ggtggacgtg aggggtgtgt 

1021 tatttttgat gaaattcatt atttctttgg tcctgaaatg gtaaacgtca aacgtggtgg 

1081 attaggtaaa aagaaaaata gaagaacgtt ttatataagt actgatggtt ttgttagaga 

1141 gggttatatc gatgcaatga agcacaaaat tgcaagtgta ttaagtggca aggttaaaaa 

1201 tagtagattg tttgcttttt attgtaagtt agacgatcca aaagaagttg atgacagaca 

1261 gacgtgggaa aaggcgaacc caatgttaca taaaccgtta tcagaatacg ctaaaacact 

1321 gctaagcacg attgaagaag aatataacga tttaccattc aaccgttcaa ataagcccga 

1381 attcatgact aagcgaatga atttgcctga agttgacctt gaaaaagtaa tagcaccatg 

1441 gaaagaaata ctagcgacta atagagagat accaaattta gataatcaaa tgtgtattgg 

1501 tggtttagac tttgcaaaca ttcgagattt tgcaagtgta gggctattat tccgaaaaaa 

1561 cgatgattac atttggttag gacattcgtt tgtaagacaa gggtttttgg atgatgtcaa 

1621 attagaacct cctattaaag aatgggaaaa aatgggatta ttgaccattg tcgatgatga 

1681 tgtcattgaa attgaatata tagttgattg gtttttaaag gctagagaaa aatatgggct 

1741 tgaaaaagtc atagctgata attatagaac tgatattgta agacgtgcgt ttgaggatgc 

1801 tggcataaaa cttgaagtac ttagaaatcc aaaagcaata catggattac ttgcaccacg 

1861 tatcgataca atgtttgcga aacataacgt aatatatgga gacaatcctt tgatgcgttg 

1921 gtttactaat aatgttgctg taaaaatcaa gccggatgga aataaagagt atatcaaaaa 

1981 agatgaagtc agacgtaaaa cggatggatt catggctttt gttcacgcat tatatagagc 

2041 agacgatata gtagacaaag acatgtctaa agcgcttgat gcattaatga gtatagattt 

2101 ctaatagagg aggtgagaca tgagtattct agaaaagata tttaaaacta ggaaagatat 

2161 aacatatatg cttgatttag atatgataga agatctatca caacaagcgt atgtgaaacg 

2221 tttagcgatt gatagttgta ttgaatttgt tgcgcgagct gtcgctcaaa gtcattttaa 

2281 agtattggaa ggtaatagaa ttcaaaagaa tgatgtttac tacaagttaa atataaaacc 

2341 aaatactgac ttatcaagcg atagtttttg gcaacaagtt atatataaac taatttatga 

2401 taacgaggtt ttaatcgtag taagtgacag caaagaatta cttatcgcag atagctttta 

2461 cagagaagag tacgctttgt atgatgatat attcaaagat gtaacggtta aagattatac 

2521 ttatcaacgt actttcacaa tgcaagaggt catatattta aagtacaaca acaataaagt 

2581 gacacacttt gtagaaagtc tattcgaaga ttacgggaaa atattcggaa gaatgatagg 

2641 tgcacaatta aaaaactatc aaataagagg gattttgaaa tctgcctcta gcgcatatga 

2701 cgaaaagaat atagaaaaat tacaagcgtt cacaaataaa ttattcaata cttttaataa 

2761 aaatcaacta gcaatcgcgc ctttgataga aggttttgat tatgaggaat tatctaatgg 

2821 tggtaagaat agtaacatgc ctttttctga attgagtgag ctaatgagag atgcaataaa 

2881 aaatgttgcg ttgatgattg gtatacctcc aggtttgatt tacggagaaa cagctgattt 

2941 ggaaaaaaac acgcttgtat ttgagaagtt ctgtttaaca cctttattaa aaaagattca 

3001 gaacgaatta aacgcgaaac tcataacaca aagcatgtat ttgaaagata caagaataga 

3061 aattgtcggt gtgaataaaa aagacccact tcaatatgct gaagcaattg acaaacttgt 

3121 aagttctggt tcatttacaa ggaatgaggt gcggattatg ttaggtgaag aaccatcaga 

3181 caatcctgaa ttagacgaat acctgattac taaaaactac gaaaaagcta acagtggtga 

3241 aaatgatgaa aaagaaaaag atgaaaacac tttgaaaggt ggtgatgaag atgaaagcgg

3301 agattaaagg cgtcatcgtt tccaacgaag ataaatgggt ttacgaaatg cttggtatgg 

3361 attcgacttg tcctaaagat gttttaacac aactagaatt tagtgatgaa gatgttgata 

3421 ttataattaa ctcaaatggt ggtaacctag tagctggtag tgaaatatat acacatttaa 

3481 gagctcataa aggcaaagtg aatgttcgta tcacagcaat agcagcaagt gcggcatcgc 

3541 ttatcgcaat ggctggtgac cacatcgaaa tgagtccggt tgctagaatg atgattcaca 
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3601 
3661 
3721 
3781 
3841 
3901
3961
4021 
4081 
4141 
4201 
4261 
4321 
4381 
4441 
4501 
4561 
4621 
4681 
4741 
4801 
4861 
4921 
4981 
5041 
5101 
5161 
5221 
5281 
5341 
5401 
5461 
5521 
5581 
5641 
5701 
5761 
5821 
5881 
5941 
6001 
6061 
6121 
6181 
6241 
6301 
6361 
6421 
6481 
6541 
6601 
6661 
6721 
6781 
6841 
6901 
6961 
7021 
7081 
7141 
7201 
7261 
7321 
7381 
7441 
7501 
7561 
7621 



atccttcaag 
aacatgttgg 
aacttataga
gttttgcgga
aagtgttatc
ttaacattga
aggaatcaga 
ttttttaata 
aatgcgaaaa 
gaattgtacg 
gaagctgaaa 
aatttcttta 
gaaacaattg 
ggtattaaaa 
gtctggggta 
acagcaattc 
ggtcctgcgt 
cttgaaactg 
gtacaaaaag 
cttacatttg 
tcaactaacg 
ccgtccgatg 
gttactgctt 
gttttaacgt 
aaatttaaag 
tacggcaaag 
aaaccagctt 
gaaatttaaa 
aggggagttg 
aatcaaaaat 
attattagaa 
aatcatcgac 
aatcacttga 
tgtcgtacga 
aagaattgat 
acaattacag 
aagaaagtgt 
tataagtata 
agctgttggg 
ggaacgcaaa 
gaagaacatt 
gtatcaccag 
gtgtgaaagt 
aagagatggt 
aaataaaaaa 
gtactgaacc 
ttgaacgatt 
aatttgtaaa 
agtattttga 
tcatgaagtg 
gttcaataaa 
cgacgaccca 
ccaaatagat 
gatatctaat 
tggaaaaccg 
cattttttac 
tattaacatt 
tagtgatatt 
aaaaacagct 
aatctcatta 
ttatgatgaa 
atggttcaga 
gtttacaaat 
agaggttgaa 
tatctttgat 
tttcttaaag 
aactttgtaa 
agcattaaaa 



tattgcgcag
tcaaat^atg 
aatgatggct 
tagtaaaatg 
gaaagatgta 
tattgacgca 
aatcgatgtt 
caaaaatagg 
acgaatttat 
gtgacatgat 
gagtttctag 
tggatatcaa 
atagaatctt 
atgctggttt 
aaatctatgg 
aaaataaatt 
ggattgaaag 
cgttcttaaa 
gtgtatcggt 
ctaatccgcg 
agaaaggtaa 
cttttgaggt 
taccatttaa 
acgttaaagg 
aaacacttgc 
cgaaagataa 
tagaagatac 
gtcgttagag 
tatccagctg 
aagtacgaca 
ctatgcgaat 
ttattgaatg 
aaagattgac 
gcgtataaaa 
acttatacgc 
acctgaaata 
ttaagaaacc 
ctgaaaataa 
cgagtattga 
atgacattaa 
atcttgaaat 
atttggataa 
gacaggtgat 
aaaagttcaa 
acaactcaaa 
tgaatggata 
tagaatagta 
acctaaagct 
gacgctaaaa 
attagtcaag 
taccctaatg 
atacctacaa 
gtttttgtta 
cgcattcaaa 
gaatatatag 
aaggaggaaa 
actggtttag 
acaaaaacaa 
tatgctgatg 
caaatgcatg 
gatggcgttt 
caagagcgta 
cctaaaatcg 
ggtgaggcac 
tcagctaaca 
aaaattttag 
caaaaccggc 
cacttaaagt 



ggagaagtga
gctgaggcat 
aaggaaacgc 
tttgaaaacg 
ttaaatcgtg 
atagcaaata 
gcagatagta 
aggtcataaa 
taatgcagta 
taaccaacta 
ttcacctaaa 
taagagtgtt 
cgaagattta 
gcgtttgaag 
tgaaattaaa 
gacagcgttt 
atttgttcgt 
aggtactggt 
aactgatggt 
cgctacggtt 
atcagtagcg 
tcaagcacag 
tttgaatgtt 
tctatatgat 
gttagatgat 
taaagttgct 
cgaagaaaca 
aatttaaaga 
aagggtataa 
aagtttatat 
cattacaaaa 
gtgaagacaa 
cataattcag 
aatcagtgcg 
gctagatatg 
atagattttt 
tagaattaca 
tggtccagaa 
tggtgtctgg 
attgtatatt 
tgaaccaaga 
taaagacttt 
aaagcattag 
gataaggcgt 
ccttcagaag 
aaggggaaac 
catttaattg 
atgggtggga 

agggagttga 

acagaattat 
taaaagatac 
cttatactga 
agtacaatga 
agttattatg 
aagaatttaa 
attaaatggc 
gtttcgctaa 
gaggattaca 
gcggtccaat 
cgttccctaa 
acgaagagaa 
aagacggtac 
atggagaaac 
ttttcccttt 
tgacaaatca 
gcgaagaata 
ttcatcggaa 
tggcgacaca 
aagatctaaa
atgcggttag 
ggctaaatgc 
acaatatgca 
taacagcttt 
aagtaattga 
aattatcagc 
atgactataa 
aacaacggtg 
tttgaagaaa 
tcagcacaaa 
ggatataaag 
acaacgaatc 
ttcttaaaat 
ggtcaattag 
gttgttttac 
gttcaaatcg 
aaagaccaac 
gcttatccag 
aatgaattga 
gttaaaggta 
tatacacatt 
attgagtcta 
ggttatttag 
atggatttat 
gctgtttgga 
ctataaaatt 
catagagcac 
caatcctcgt 
cgtaccttta 
aaaagcgtct 
tgacgattga 
aggatgagta 
gagtttttga 
cttatcaaga 
cgttatctct 
actaaacgtt 
gctggagaaa 
ttacgtgaat 
cgtgatccgc 
tatttcaaaa 
attatgattc 
aaagagaatt 
taatagctgg 
actcaggagc 
gtactgttac 
aaaatggtca 
ttaacagagc 
aaaaattgtg 
tagagagcac 
tgatgtacct 
cggagatgag 
tgaatataat 
gtctgaacta 
aacatataga 
agtaaaacat 
attaacgaaa 
aaaaattggt 
tgaatcaggg 
agagattcgc 
acaaggtaaa 
atttagaaca 
ggctgagaaa 
agttgataat 
tgatggagac 
tactggaaac 
actgcggtaa 
tacgatttaa 



tcatgctgca 
agctggtaaa 
tgatgaagcc 
aattgtagca 
ggtaagtaaa 
aaaaataaat 
aaatggattt 
atttatcgga 
aaccgcaaga 
ctaaattaca 
ctttgagtgc 
aagaaaaact 
atccattatt 
ccgaaacttc 
atgctgcgtt 
caaaagattt 
aagaagcatt 
cgattggctt 
agaaagaaga 
cgcaagtgtt 
atgtaacaat 
taaatgcaaa 
cagttcaaga 
ctggtggtat 
acactgcaaa 
aattagattt 
ttatgaggtg 
aatcaacaca 
gttgaattgt 
gataagctga 
agttcaatgg 
tgatttgctt 
cttaaagcag 
attagagaat 
tttattagaa 
aatggaggta 
taaatacgcg 
aagaagaaaa 
tagaacaagc 
aaggtgatta 
atcgtttgaa 
gcggaggata 
agaaaaacat 
tgctaaggta 
actgattagt 
aattaggtgg 
tgttgagaaa 
aataagacaa 
attgatattt 
gtaaatatca 
tttattgtta 
tgtgcatata 
gcgagaatca 
aaaatgggaa 
agctctcgcg 
gcaagtgcgc 
gaaggcgcgg 
gttgaaactg 
aatacagacg 
aaaattgttt 
caaaacaatt 
gttttattac 
gattgggatt 
aaaaagtcag 
ggtgaaaaag 
gtgacagagg 
agtcggttaa 
atgttgtagt 



gaaacattag 

aacaaacaag 

attgaacaag 

agcgatacac 

acgccagagg 

atgaaagaaa 

tcaagattcc 

aacattcgca 

aagacaaaat 

agcaaaagca 

aaaccaaaga 

tttaccagaa 

agctgactta 

tggcgtggct 

cagtgaagaa 

aaatgatttt 

tgcagtggcg 

aaaccgtcaa 

acaaggtacg 

taaataccac 

ggttgttaat 

tggcgtatat 

agcaggtaag 

taatgttcag 

acaatttgct 

aaaaggacat 

ataaaatggt 

agtacaaagt 

tgacaaatca 

caaaacaaga 

ttaaaagtga 

gtcaaattta 

ttgttaaaaa 

ttaataggtc 

cacttcaacg 

tcagaagatg 

tgttcatttt 

attattatat 

tatctcaaac 

tttacccagt 

tataaagcaa 

tagttcatga 

tttggcataa 

attgttgaag 

gagattggtc 

cgtgggcctt 

aagtcaggaa 

gggcaaaata 

tgtacaaagt 

ataatattaa 

ttgacgatat 

gttatattgt 

taagaaataa 

atgtttcaaa 

tttacgaggg 

caaaggcgta 

aattaaaata 

gtggagaact 

gagaaggtaa 

ttaatgaaga 

acgtagctgt 

ctaaagttat

tctcaagtga 

tacgtaagta 

gcgaagaggc 

gtaacgaaga 

tataccagat 

agagccatct 
7681
7741 
7801 
7861
7921
7981 
8041 
8101 
8161 
8221 
8281 
8341 
8401 
8461 
8521 
8581 
8641 
8701 
8761 
8821 
8881 
8941 
9001 
9061 
9121 
9181 
9241 
9301 
9361 
9421 
9481 
9541 
9601 
9661 
9721 
9781 
9841 
9901 
9961 
10021 
10081 
10141 
10201 
10261 
10321 
10381 
10441 
10501 
10561 
10621 
10681 
10741 
10801 
10861 
10921 
10981 
11041 
11101 
11161 
11221 
11281 
11341 
11401 
11461 
11521 
11581 
11641 
11701 



^tgagtgaca ctataa Jat 

ttgaaaataa ggagagtatt 
agaag atcca 

atttgaaatt gtatacgaag 
9atgaa 9Cca agaga J tc J 

ccaattcaca gttaaagacc 
tcgtgaacaa gtgattttca 
ccagaacatg aaftaaajcc 

tttca^' 93 t »«£X 
tttcactatg tgctttccat 
gaggctttaa ttgatgcatt 

EST* 

cagatgcagc aaatttaaa.- 
acaaa a g 

atctagccaa gcaatatgac 

tacaaaaaac atcagccqaa 
cggcagaaag tggjggg 8 
caaaa t9at3 | 99 ^ 
«gttttagg tattgcaoca 
atactgtcac tcaagcafca 
ttaaagatgt ttatlgcaat 
aagttaatac aagg??"- 

2K2" 9 «S2E 

caatgggcga tgcagqtate 

SJcgctccaat gagagctftf 
99 3 aaaag tc aggcfttaat 

=SSSS5S 

asss js~= 

"gattggtt ttcfaattta 
"gctgctgc aattggtcct 

aatctgaaac atttS««* 
cgatatcagc aatagt C g at 
tat" 3330 " 

tatttgaatt tattttaaat- 

= 3535 

SS2B ssas 

|P S - 225S5 

tcaattcttt aactaaa|gt 



atacacaaca 
agcacaaggc 
aaatgtagaa 
ataaaatggc 
aaattaaatt 
caatggattt 
ctgacagatt 
taaaagaacg 
ttactcaagg 
tgaagattta 
tgaaaatggt 
atatcaaaat 
ttaaccttaa 
atgggagaaa 
agatcatttg 
ggcaacaact
cttgatggaa 
aaggtatctc 
aacaaacaag 
tttgaagagt 
aaaaccagta 
aaatccattg 
gcatcaggaa 
ggcgcaacag
tttccagcag 
tttacaggta 
ggttctgacg
gaagcaagtg 
ataagtgttg 
ggccttgaga 
actgaaatag 
aacccaagag 
agcgcaacaa 
gctatcaaag 
ggcacagtaa 
atgaataaat 
cccgtaatgg 
agtgatggtt 
gtagtttttg 
ccattgttag 
cctatattag 
ttggctggtt 
tttgttaatg 
caacctttcg 
ttcgcaaaag 
gttcaagcac 
tttgtaatta 
gttaaagcct 
aatatcatac 
gtttgggacg

caattatggt 
ggattgatag 
atttggaatg 
acaaatatga 
aaagcgcagt 
aaagaaattt 
aatacggtag 
cgcgatggct 
gctattaaaa 
ggaatggata 
ttagttaaga 
ggaaatggtc 
atcacaccta 
gcacaaactt 
aaagatatta 
ggtaccaaat 
aaacttttaa 
atgggaattg
gatcaaacga
attgctacgg 
gcataagagg 
aaaactaaaa 
acaaacgtac 
aatcgatgat 
gatggatatg 
tatgcatgca 
tcaacaaact 
acatataaag 
aaagacgcta 
aaaaataatg 
ccgtttggtt 
gaataaaagg 
cagaaatcaa 
tcaaatatac 
ctatcacagg 
aagaacaggg 
caaatgagct 
tcaaaaaagc 
aagtttttga 
gtaaaggttt 
aagcttttgc 
gcagtgaatt 
atgctgaaac 
aagaacttga 
gtgtgcaagc
aatatcaaag 
atacattagc 
tgaaagaatc 
cattcagtgg 
aagaatttaa 
gtttagcgat 
gtggtcgctt 
accaaacatt 
taaaattagt 
aagaattaat 
ctaaaagatc 
ggttaggtgc 
ctagtattgc 
gaactgtctt 
tagcagtcgc 
gtgcaattga 
ttgattctgt 
atatttggag 
ttcaaaatat 
aaccaattat 
tgattgtcag 
ttggcttgat 
ccgttgtgat
ttgtaggtaa 
caggaatttg 
caacaaaaag 
aaaattggtt 
cattacttag 
ttagtaattt 
gaattgcaag 
tgagttccat 
aaggacttaa 
aaatacctaa 
acggtaagat 
caaatggttt 
atacagatac 
attcaatgtt 
aatctggtgc 
ggcttggcga 
attatatact 
caggcgacat 



atattgtatc 
ttaaagcaac 
gggcaacccc 
cgtaacatta 
ttaacaccac 
attgaggacg 
gttgtaaaaa 
cctgatggaa 
gaggaaacta 
caatgttgaa 
acgaagtttt 
acatttctga 
agggttatct 
tttatctata 
acgaaacttt 
cgaaaaatca 
ttataagaaa 
cgaaaacagt 
gaattattta 
tcaagttgaa 
aagtatggga 
gatgattggt 
agaagttgat 
aaaaaaattg 
tgttggtgga 
aaatgccaca 
cgtacagtta 
tgttttggat 
tgatagtatt 
aattgcttta 
tttgaaaaaa 
gaagacatta 
tgaagcattt 
tagttatcaa 
taaagattct 
aggtgctgat 
caaaaagcta 
aattgttatt 
atttataagt 
aaaggctggt 
cacagcttta 
atttacaatt 
aagtgttaaa 
taaaaacatc 
tcaaatcaat 
atgcaacttt 
gttcgcgatt 
tacttgggag 
taagttcttc 
gattcttaaa 
aatacttggt 
ggacgtaata 
tatttttgga 
atctaatact 
tggcgtcaaa 
aagaaattgg 
ccgtttatgg 
tatagataag 
taaattaatc 
gttacacact 
tgcacgtgac 
tagaaatgaa 
taccgcttat 
aaacggaacg 
atcatcggca 
taaagttggc 
tgaagctttt 
aacaaaagct 



aatcaatagt 
agttggtaat 
tctattttat 
ttcaattagt 
acttcatttc 
aaaatagcac 
tttacgataa 
tgaatgcact 
gaaattttat 
aaatatggac 
aaaaatgcca 
agaaaaagca 
ttttgaactt 
ggtttggatt 
aaaaccttaa 
actgatagtt 
aacgttgatg 
gcagaagctc 
gaaagagaat 
gctcaaagaa 
cctaaattaa 
gtaactgcac 
aaaggtttag 
cagaactcat 
gttttaggag 
gagtcattct 
attacccgtg 
atggtagcaa 
actaaatacg 
ttctctcaat 
gctatatcaa 
gcagaaattg 
ggtgcaaagg 
gaatttttaa 
gaaagtggct 
gtatgggctt 
tctatagcgg 
ttcagtggta 
acaattggca 
ggattgatta 
actggtccaa 
gcttataaga 
caaacatcta 
tttaaacaag 
ggattcttta 
attaaagcga 
tggcaagtga 
aacataaaag 
tcaagtttat 
ggagcagttc 
gttgttaggt 
agaagtatat 
tttttattta 
tggagcagta 
tcaaaattta 
atgtcaaata 
agtaaggtac 
attaaaagtc 
gacggtttaa 
ggtacagagc 
acattcgcta 
atgattgaat
tcacctaaag
cttccaagat
tttaactgga 
gatgttttag 
ggaattgatt 
gcatggtcta 
11761 agattaagaa aagtgctact gattggataa aagaaaattt agaagctatg ggcggtggcg
11821 atttagtcgg cggaatatta gaccctgaca aaattaatta tcattatgga cgtaccgcag 
11881 cttataccgc tgcaactgga agaccatttc atgaaggtgt cgattttcca tttgtatatc 
11941 aagaagttag aacgccgatg ggtggcagac ttacaagaat gccatttatg tctggtggtt 
12001 atggtaatta tgtaaaaatt actagtggcg ttatcgatat gctattcgcg catttgaaaa 
12061 actttagcaa atcaccacct agtggcacga tggtaaagcc cggtgatgtt gttggtttaa 
12121 ctggtaatac cggatttagt acaggaccac atttacattt tgaaatgagg agaaatggac 
12181 gacattttga ccctgaacca tatttaagga atgctaagaa aaaaggaaga ttatcaatag 
12241 gtggtggcgg tgctacttct ggaagcggcg caacttatgc cagtcgagta atccgacaag 
12301 cgcaaagtat tttaggtggt cgttataaag gtaaatggat tcatgaccaa atgatgcgcg 
12361 ttgcaaaacg tgaaagtaac taccagtcaa atgcagtgaa taactgggat ataaatgctc 
12421 aaagaggaga cccatcaaga ggattattcc aaatcatcgg ctcaactttt agagcaaacg 
12481 ctaaacgtgg atatactaac tttaataatc cagtacatca aggtatctca gcaatgcagt 
12541 acattgttag acgatatggt tggggtggtt ttaaacgtgc tggtgattac gcatatgcta 
12601 caggtggaaa agtttttgat ggttggtata acttaggtga agacggtcat ccagaatgga 
12661 ttattccaac agatccagct cgtagaaatg atgcaatgaa gattttgcat tatgcagcag 
12721 cagaagtaag agggaaaaaa gcgagcaaaa ataagcgtcc tagccaatta tcagacttaa 
12781 acgggtttga tgatcctagc ttattattga aaatgattga acaacagcaa caacaaatag 
12841 ctttattact gaaaatagca caatctaacg atgtgattgc agataaagat tatcagccga 
12901 ttattgacga atacgctttt gataaaaagg tgaacgcgtc tatagaaaag cgagaaaggc 
12961 aagaatcaac aaaagtaaag tttagaaaag gaggaattgc tattcaatga tagacactat 
13021 taaagtgaac aacaaaacaa ttccttggtt gtatgtcgaa agagggtttg aaataccctc 
13081 ttttaattat gttttaaaaa cagaaaatgt agatggacgt tcggggtcta tatataaagg 
13141 gcgtaggctt gaatcttata gttttgatat acctttggtg gtacgtaatg actatttatc 
13201 tcacaacggc attaaaacac atgatgacgt cttgaatgaa ttagcaaagt tttttaacta 
13261 cgaggaacaa gttaaattac aattcaaatc taaagattgg tactggaacg cttatttcga 
13321 aggaccaata aagctgcaca aagaatttac aatacctgtt aagttcacta tcaaagtagt 
13381 actaacagac ccttacaaat attcagtaac aggaaataaa aatactgcga tttcagacca 
13441 agtttcagtt gtaaatagtg ggactgctga cactccttta attgttgaag cccgagcaat 
13501 taaaccatct agttacttta tgactactaa aaatgatgaa gattatttta tggttggtga 
13561 tgatgaggta accaaagaag ttaaggatta catgcctcct gtttatcata gtgagtttcg 
13621 tgatttcaaa ggttggacta agatgattac tgaagatatt ccaagtaatg acttaggtgg 
13681 taaggtcggc ggtgactttg tgatatccaa tcttggcgaa ggatataaag caactaattt 
13741 tcctgatgca aaaggttggg ttggtgctgg cacgaaacga gggctcccta aagcgatgac 
13801 agattttcaa attacctata aatgtattgt tgaacaaaaa ggtaaaggtg ccggaagaac 
13861 agcacaacat atttatgata gtgatggtaa gttacttgct tctattggtt atgaaaataa 
13921 atatcatgat agaaaaatag gacatattgt tgttacgttg tataaccaaa aaggagaccc 
13981 caaaaagata tacgactatc agaataaacc gataatgtat aacttggaca gaatcgttgt 
14041 ttatatgcgg ctcagaagag taggtaataa attttctatt aaaacttgga aatttgatca 
14101 cattaaagac ccagatagac gtaaacctat tgatatggat gagaaagagt ggatagatgg 
14161 cggtaagttt tatcagcgtc cagcttctat catagctgtc tatagtgcga agtataacgg 
14221 ttataagtgg atggagatga atgggttagg ttcattcaat acggagattc taccgaaacc 
14281 gaaaggcgca agggatgtca ttatacaaaa aggtgattta gtaaaaatag atatgcaagc 
14341 aaaaagtgtt gtcatcaatg aggaaccaat gttgagcgag aaatcgtttg gaagtaatta 
14401 tttcaatgtt gattctgggt acagtgaatt aatcatacaa cctgaaaacg tctttgatac 
14461 gacggttaaa tggcaagata gatatttata gaaaggagat gagagtgtga tacatgtttt 
14521 agattttaac gacaagatta tagatttcct ttctactgat gacccttcct tagttagagc 
14581 gattcataaa cgtaatgtta atgacaattc agaaatgctt gaactgctca tatcatcaga 
14641 aagagctgaa aagttccgtg aacgacatcg tgctattata agggattcaa acaaacaatg 
14701 gcgtgaattt attattaact gggttcaaga tacgatggac ggctacacag agatagaatg 
14761 tatagcgtct tatcttgctg atataacaac agctaaaccg tatgcaccag gcaaatttga 
14821 gaaaaagaca acttcagaag cactgaaaga tgtgttgagc gatacaggtt gggaagtttc 
14881 tgaacaaacc gaatacgatg gcttacgtac tacgtcatgg acttcttatc aaactagata 
14941 tgaagtttta aagcaattat gtacaaccta taaaatggtt ttagattttt atattgagct 
15001 tagctctaat accgtcaaag gtagatatgt agtactcaaa aagaaaaaca gcttattcaa 
15061 aggtaaagaa attgaatatg gtaaagatct agtcgggtta actaggaaga ttgatatgtc 
15121 agaaatcaaa acagcattaa ttgctgtggg acctgaaaat gacaaaggga agcgtttaga 
15181 gctagttgtg acagatgacg aagcgcaaag tcaattcaac ctacctatgc gctatatttg 
15241 ggggatatat gaaccacaat cagatgatca aaatatgaat gaaacacgat taagttcttt 
15301 agccaaaaca gagttaaata aacgtaagtc ggcagttatg tcatatgaga ttacttctac 
15361 tgatttggaa gttacgtatc cgcacgagat tatatcaatt ggcgatacag tcagagtaaa 
15421 acatagagat tttaacccgc cattgtatgt agaggcagaa gttattgctg aagaatataa
15481 cataatttca gaaaatagca catatacatt cggtcaacct aaagagttca aagaatcaga 
15541 attacgagaa gagcttaaca agcgattgaa cataatacat caaaagttaa acgataatat 
15601 tagcaatatc aacactatag ttaaagatgt tgtagatggt gaattagaat actttgaacg 
15661 caaaatacac aaaagtgata caccgccaga aaatccagtc aatgatatgc ttcggtatga 
15721 tacaagtaac cctgatgttg ctgtcttgcg tagatattgg aatggtcgat ggattgaagc 
15781 aacaccaaat gatgttgaaa aatcaggtgg tataacaaga gagaaagcgc cattcagtga 
attaaacaat atttttatta atttatctat



agaattactg 
cttagacgct 
cgaaactgca 
gaaattacaa 
taaattatta 
aacaaaattt 
taaatcagct 
aacatcggac 
tgagagaacg 
cggattggaa 
tgagattaaa 
cattgatgct 
ttcggaagaa 
aaacgcagaa 
ggtcaaagaa 
acaaaatggt 
tacactttca 
atatgatgat 
tgctgataaa 
agataaagta 
tatcaatgtt 
gaatgattct 
agacgatatt 
cggttcactt 
cggtggttca 
tggtataaca 
tgttctggag 
tccaaacaca 
taatgcttat 
tgcgggtatc 
atatgcaaca 
acgacgtgat 
agatgatgca 
agctaatttg 
caagttatct 
tattcttaac 
agagctgaga 
tttgattgct 
aggagaaatt 
agaacaacaa 
gattacaagc 
cacaagaaaa 
ctgaggaaga 
atggcaaaag 
ggtacagaac 
aaccatgctc 
ttgttatatc 
agatcagatt 
cagatactca 
aaacacgaac 
acactcaacg 
aaaaccttag 
gataagaaca 
ccgctaatta 
tcggattaaa 
agttaagagt 
gatgcaaaag 
gcgaacaaag 
actgtagtcg 
gcaaatcaaa 
gcgccaatta 
gtggttgata 
ggaaacaatt 
tctttatgtt 
ataataaagc 
cgcaaaagtt 
aaattgttga 



aatagcgagt 
gtgattgatg 
acgattggtc 
gatgtttata 
cagtcacaat 
ggtttaacgg 
attgaagcag 
tataaaacag 
actttaaaag 
gaacaaaaac 
gcaagtattg 
caagatgatc 
gagcaacgcg 
ctaaaggcta 
agcacagatg 
aaggaaatca 
aatatattaa 
aacggagtgg 
attgatatta 
gataaaaccg 
aatagaattg 
attgaactag 
tttacgcgac 
tatatgtcac 
tctggtacga 
atcaattcct 
tcttacgctt 
gacaaagtgc 
tcgagtgacg 
aggttttcta 
ggtggagata 
ggtaataggt 
ggagatagga 
catattactt 
atcgaaaatc 
ttacctatta 
gaagatagaa 
gaagaggtgg 
gaaggtatag 
ctaagaatca 
taatcctgaa 
cgcgatgtta 
gtaatcctta 
aaattatcaa 
gtgtagtata 
aagattttaa 
aattaactaa 
tatctccaga 
tagtctttat 
atgaatggcg 
aaattaaatt 
atgctattca 
tacgtgatat 
tagcattatt 
ttttggagct 
cagtgcttcg 
taataacaag 
gtattagccc 
ctttatatac 
aattaaagaa 
aagaagtaat 
tatgttaatg 
caacccagat 
agcgacaggc 
aaagattgaa 
ggatattgtc 
gagcgcaaat 



acttagtaga 
cttataatca 
ggttggtaga 
cagatgtaga 
acactgatga 
tgaatgaaga 
ctagagaatc 
acaaagacgg 
gtgaaatcaa 
aatatactga 
aacaagcaaa 
ttaaagagaa 
ctatacaaga 
gaaacgctga 
cacagaggaa 
aattaagaac 
acgagattgt 
ctcaagcttt 
acggtaatag 
atattgtcaa 
gaattaaagg 
gtggtattgt 
tgaaagacgg 
atttcggtac 
ttcaatggtg 
atggtggtgt 
catcgaatat 
ctggattaaa 
gttatattat 
aagaaagaaa 
caacaatcga 
atattcatat 
tagcttctaa 
ctgctggcac 
aatataacga 
gaacgtggtt 
aattatcgga 
agaatttagg 
cgtatgatcg 
agaaattgga 
tatacaattc 
aaagcgtata 
gcactatttt 
caatacagaa 
tcaagatttc 
atctgaagaa 
caaaaaacaa 
ggtaacagtt 
tcttttagaa 
catcagaagg 
aggtcaaaaa 
aaaagaaaga 
gaaaatgtgg 
gcgtatgctt 
tcgctgtgga 
gcactggcct 
atacatcgta 
aattccagta 
aacgtataaa 
atataaagct 
gacacctacg 
acaaaaaatc 
ggttggtatg 
gaaaggctgc 
aaatatggtc 
gttttcccgt 
ttaaatactt 
acaacacgct
taatgatttg 
aattaaaaat 
tacacaagct 
agatgtcaaa 
aaaatataaa 
tttgcagtta 
cacaaaagaa 
tattgttgaa 
agataaagtt 
tgaccagtta 
tcaagaagcg 
ggaatcgcaa 
tgctcaagct 
aaagaaagct 
aacattgact 
tactaaagaa 
tcaaaatgtt 
gaatgtgggg 
agaaataaac 
cagtcttaat 
cggtgacaat 
gcaacgtact 
tcacctaaga 
ttcgacttat 
ggataaaact 
cgttgcacta 
caaaagcaaa 
ccgatttgca 
gtttggttct 
taaaggtctt 
agcagggtat 
acagagtaca 
ctcaatttat 
aattgggcgt 
tagagatgaa 
tgataaagct 
agacacctat 
attaaaagag 
tctatggatt 
ggagtcaaag 
attatttatc 
tacaagaaaa 
tatacaaaaa 
aggtttattt 
acaggaagtt 
aacgctaaga 
cgcgtgaaag 
aacactgaaa 
agcgggtgta 
ttagaagaga 
acccaagagc 
gaaatagatg 
gtgcttggtt 
atgggcatat 
cgtgtttctg 
tttattttgg 
ctgatcttag
gacgatgaaa 
gacaatccaa 
gaaaataagt 
aatatgaacg 
aagcagaaaa 
gatttcagtg 
aaggtttata 
aaataattaa 
caaagtatgg 
tcacatcatt 



agtcttttgt 
aaagcggact 
aatttagaat 
ttatttcttg 
atcgccattt 
gaagcgttgg 
gtcggagaac 
caattacgtg 
cgtttagata 
acgttaaacg 
agtgatttgt 
caagaagctt 
gcgtatgctg 
aaacttgaag 
aatgcttata 
cgctatggtt 
gagtttaatg 
acagatggaa 
ccacgtggta 
cttcttatcc 
ttatcaagag 
aacagatatg 
tggagaggga 
tttagaaata 
attgatggtg 
tacagtgata 
acgtcagata 
caggcaccgg 
ttcacgctgt 
gatgagaact 
gttcaaattg 
ggcaaattta 
gacctactgt 
agacgtactt 
tcgacatcag 
caactggaac 
gagtctgaaa 
aaacttgata 
tttgtcacgt 
catcttatcc 
aatgcaggat 
acaggaaatt 
taaagaaaat 
tttaaggagg 
tagtacaaat 
ttacaacttc 
aaattgcgga 
tagttaaaga 
cagtatgaaa 
ctgaattggg 
atgataaaac 
aagttaacat 
aaaagaataa 
tagttgggac 
aagagaggtg 
gtttggtaag 
ataaaaggag 
cattagtaaa 
ctatatcatc 
catctcaaga 
atagaaaagc 
acacaaatga 
atggtttgac 
ttatgatcac 
tgcttataat 
aaactatgac 
tggcggagct 
tggtcaaaac 



cagaagctac 
tacaagcaag 
ctatgacacc 
agtatagaaa 
cagatagatt 
aaataatagc 
ctaatgctgt 
actatgtaaa 
ctgctgaagc 
aatatcgaaa 
ccaataatcc 
taaaatcata 
atggtaaaat 
aggcaaaaca 
cagacaacaa 
ctcaaattat 
caaccaatcg 
caacaatcag 
ttagattaaa 
aaaatatgcg 
agggtcttga 
ttcaaataca 
aacgttcaac 
acaccgctgg 
aaggtgaaga 
gtggcatgaa 
ataatcgggt 
tgtatttata 
ctaatgcaga 
atgattacgg 
ttaatggacg 
atatgctgaa 
ctgtaggttc 
attcggccgc 
cgcgtaaata 
attcaaaagc 
ttttagctag 
gatacgtagg 
atgatgacaa 
ctgttatcaa 
aacaaacaag 
atgaggttaa 
caacaatgtg 
ccatttaatt 
cgacaaagaa 
cgaaatggtt 
gacgttaaat 
agtagttgaa 
agctatgagt 
gtggttcaaa 
aatgctcagc 
taaattagat 
gaaagaaaat 
aatatttggg 
attaccatgt 
tgtaagtaat 
caaacaaatg 
tcaattctta 
aataatactt 
aggtaaatgg 
aacagggcaa 
tttagggtag 
aattcattag 
gccaatatgt 
atcccgtttg 
agctttttac 
ggacacgttg 
tggaacggta 
19921 aaggttggac taatggcgtt gcgcaacctg

19981 ttcattatta tgacaatcca atgtatctta 

20041 ttggcaataa agctaaaggt attattaagc 

20101 aacctaaaaa aattatgctc gtagccggtc 

20161 acggaacaaa cgaacgcgat tttatacgta 

20221 taagacatgc aggacatgaa gttgcattat 

20281 atcaagatac tgcatacggt gttaatgtag 

20341 ttaaatcaca ggggtatgac attgttctag 

20401 caagtggtgg gcatgttatt atctcaagtc 

20461 tacaagatgt tattaaaaat aacttaggac 

20521 tactaaatgt taatgtatca gcagaaataa 

20581 ttattactaa taaaaatgat atggattgga 

20641 taatagccgg tgcgattcat ggtaagccta 

20701 catcagctaa aaacaaaaaa aatccaccag 

20761 atgtccctta taaaaaagaa caaggcaatt 

20821 taagagacgg ttattcaact aattcaagaa 

20881 ttacgtatga cggtgcatat tgtattaatg 

20941 gtggacaacg tcgttatata gcgacaggag 

21001 gttttggtaa gtttagcacg atttagtatt 

21061 tatagggaat cttacagtta ttaaataact 

21121 tttttaacat ttctctcaag atttaaatgt 

21181 tattttttta tgttatagct agccttcggg 

21241 catcaactat ttacatctat ccttgttcac 

21301 gatagagagc atagttttca tactactccc 

21361 taacagttta cggggtgctt ttatgttata 

21421 tagccgggca gaggccatgt atctgactgt 

21481 cactcgatac atatatctta acaacataga

21541 tcgatacggt tatatttatt cccctacaac 

21601 attgtggtta ttttttgcgt ttttttgggg 

21661 caaacgcttg tggaaaagct aaaaggttaa 

21721 tttggacgct cgtgtacgtt agagaatgac 

21781 cttgtgttaa aaagccttta atatcagttg 

21841 aaaaaagggc agaaaaaggg cagatacctt 

21901 taactctctg tccattttct ctgttacatg 

21961 tgtatgtcct actcttttca taattgcttt 

22021 tatgtgtgta tgccttagtg tgtgagtagt 

22081 agctgaggac aatcgtttgt ttatcctact 

22141 gaatataaac cctctatcaa catagcttgg 

22201 cattattttt ttcaatacat ttgctatcct 

22261 tgcggtctta gtagtatctt tgtgaccaaa 

22321 gccattaata gcgatcgttt tatttttgag 

22381 ctcacctatg cgcatacctg ttaaagcttg 

22441 agctctatac tgcatgttat tatcgttcag 

22501 catctctaaa tagttataca ttttcgcttc 

22561 cttctttggt agtgtgacgc tatttaatat 

22621 ggcgtattta atagcttctt tcatatgtcc 

22681 tacgtttgat aatttgttaa taaatgcttg 

22741 taaattttga gaactgttct ttttgatgtt 

22801 cgttacttta aagccagatg tttttatatg 

22861 aaaagtcaaa gtttttaatt cgcttgacga 

22921 ttctaaacga aacattgcct ctttttgcga 

22981 tacacgtttc catttatctg tatacggatc 

23041 ttcattgttc ttatttttaa atttttcaaa 

23101 aaaaaataat aagggtaggc gggctaccca 

23161 aaaatacaga cgccacttat aattataaga 

23221 aatatatacg tgttttaaag gataaacctt 

23281 agggatctgc aatatattat tattaattct 

23341 tattactgga tttttaattt tttggggtaa 

23401 ctggaaagaa tttatgcaag cgtaactatt 

23461 tgatactatg ttattaatgt ttctgtcaat 

23521 atcagatata aattcaataa aataatcttt 

23581 ttttttatcg aaaacttctt ttaatatagc 

23641 aaacaatctt aaataatact cccacttcaa 

23701 ttctttagag gataagggaa taacatttac 

23761 catcactatt gcaaagtgtg aattagaaaa 

23821 aaaaactatt tctccttgtt taaactttgg 

23881 aaatctcttg agtaaatagt gaatacctga 

23941 agtttttaat ttattaatgc gtttttctat 
gttggggtcc tgaaactgtg acaagacatg
ttaggttaaa cttccctaac aacttaagcg 
aagcgactac aaaaaaagag gcagtaatta 
atggttataa cgatcctgga gcagtaggaa 
aatatataac gcctaatatc gctaagtatt 
acggtggctc aagtcaatca caagatatgt 
gcaataaaaa agattatggc ttatattggg 
aaatacattt agacgcagca ggagaaagcg 
aattcaatgc agatactatt gataaaagta 
aaataagagg tgtgacacct cgtaatgatt 
atataaatta tcgtttatct gaattaggtt 
ttaagaaaaa ctatgacttg tattctaaat 
taggtggttt ggtagctggt aatgttaaaa 
tgccagcagg ttatacactc gataagaata 
acacagtagc taatgttaaa ggtaataatg 
ttacaggggt attacccaac aacacaacaa 
gttatagatg gattacttat attgctaata 
aggtagacaa ggcaggtaat agaataagta 
tacttagaat aaaaattttg ctacattaat 
atttggatgg atgttaatat tcctatacac 
agataacagg caggtacttc ggtacctgcc 
ctagtttttt gttatgatgt gttacacatg 
ccaagcatgt cactggatgt tctttcttgc 
cgtagtatat atgactttag cattcccgta 
attgctttta tatagtagga gtgaactata 
tggtcccaca ggagacatct tccttgtcat 
aatgttacat tcgctataac cgtatcttaa 
caacaaaacc acagatccta ttaatttagg 
caaaaaaagg gcagattatt tgaaaaaggg 
aaatgacaaa aaccttgata caacagtgtt 
cggtttacca tcatacaagg gtgggattaa 
ttacaaagga tttgtagcgt ctttaaaaat 
ttagtacaca agtttttcta atttttgctc 
tgtatacacc tttatagtcg ttttttcatc 
taacgatata ttcatttccg ccaataaact 
aactttttta tttatattta atgattctgc 
gccttgcata ggatttcctt ggcaagttgt 
ttcccattgt tgcatctttt tattttctaa 
tgaattgatg gcgatttttc ttcttgaacc 
tccagcatta catttgattc tgtgaatagt 
gtcaacatct ttaacttgga gagctaataa 
aacttctaca gccccagcaa ctaaaatacg 
cataaaatcg cgtatctgta ttacctgttc 
ttctttttct atatcttcta tcgtcttact 
gtgttcgttt ggataactgt aaaatttaac 
aagttgacgc tttacctgat ttgcagaata 
catgtacttt gtatcaattt tgtttaaaag 
tttgattctt gttttcaaat tatcaagcgt 
atattcaagc cattcatcta ataacgcgtg 
cttgttgttt agtttttctt ttattttttc 
ttgctttgta ttcttattca agacaacact 
tttgcatttc tcgtagtatc tatacttcgt 
ccacatttta catccctcct caaaattggc 
tgaaaattgt ataaaaaaag acgcctgtat 
ttacatggtt aattaccaaa aacggtaacg 
taatatatta aaattatatc atcttatatc 
atttatcagt aacataatat ccgaagaatc 
aacttttctt atgcgaaact tactaatcgg 
accttttaat ttttttacct tatcaattgc 
tttatttaat ttattttcaa tttctaaact 
agtgatgaat tctgtgttgt ttttttggta 
tgaattattt tgcgcgctaa ttaaatttaa
atcaaaattc atctttaaat actttttgtt 
tatatcctcc gtattagaat catttttatt 
ttctttatta acgtttatac cgaaatctac 
ataaaaacct ttatggtctt tttcaccttc 
atctaacttt ttaaattttg gatttccaga 
atcatgcgtc atcatttctc ctttattctc 
25741
25861

25921 
26041

26101 

26161 

26221 

26281 

26341 

26401 

26461 

26521 

26581 

26641 

26701 

26761 

26821 

26881 

26941 

27001 

27061 

27121 

27181 

27241 

27301 

27361 

27421 

27481 

27541 

27601 
27721

27781 

27841 

27901 

27961 
gctcacaccc

cataatgaat 

tatttacgcg 

ccactagtta 

attataaact 

tactttaatt 

cttaaagtta 

cgctaaatat 

gaagcgactt 

catatatcta 

tctttaatag 

attgaatcac 

aaaaatactt 

gcaccacatg 

ttattctgtt 

gtgagttgag 

tcaggaactc 

gttttagata 

gataagtgac 

agaatatcta 

ttttgcattg 

caaaagttca 

ttatagttca 

ttttcgttat 

gctttaggtg 

acaacagaca 

tattttttta 

tgcaagaacg 

ttaggacaaa 

tacaacaaac 

gatgaaatta 

ttagacaagc 

aaagtgaatc 

cagatgtgag 

gcgataaccg 

ttacaaacat 

tattttgtag 

agaaatcatg 

agaaatatga 

caaagcaaaa 

gatgtcccac

acattaaaag 

caaaacttac 

tcggtagctg 

aacggtgttg 

attaaaaaga 

ttggatatca 

ccaaaagtaa 

acatcttaaa 

aattatagca 

aattttcagt 

tttcgcaaaa 

aaagtatcag 

ccatattaga 

aagtgaatac 

ctatgaaaag 

atgttacaaa 

catgctagtt 

ttgttgaaaa 

tttgagatcc 

cttttgtaac 

catgtttttt 

gttgataaca 

ccgaacatcg 

ggtcgagaac 

atgcttaaat 

acagctcaag 

gaaatcgcaa 



tcaccaccat tcaacgtcta 
ctcctttggt taacttatcg 
cattatgtga cgataaatct 
aaacttcata tactatagtt 
ccttttaaac actgctgaaa 
ctttaatcca catatattta 
agattgcttt tttcatgtca 
acgttattaa tcacaataca 
tgatatcatc atacttcgga 
cacgcttgat aagacttact 
aatcttcttt cttaataaaa 
cattaactaa aatacaaaaa 



cttcafcgcaa 
caatatacga 
catctaattg 
aaaatatgtt 
gataagaatc 
ataagaataa 
tttttgacat 
cttgacgcaa 
gtaatgcctc 
acttttttaa 
aatgtttgaa 
tgaacggtaa 
tatcagaaag 
ttatcaaagc 
aacagaaagt 
agaaaaggtt 
ttgaagctta 
acctaagcat 
ctaaaaagct 
ttaacggtgt 
agtaacattc 
cgagagctgg 
tctgctgaat 
ttaattttaa 
gaaaagatat 
ctgatagcga
tcattatcaa 
acgaaaaaat 
cagctattcg 
atccagacta 
ttttacaaca 
gtagtgataa 
atataggaca 
gtggagaaag 
aaaaacgaat 
caggcaaagg 
aggaggaaca 
aaagaagtta 
aaagtaagaa 
gatttgtcgc 
catggcttcg 
aaattaacat 
aacctagcag 
agagtttcag 
aatttagaat 
actgtttaga 
aggttagcta 
aacaaacaca 
agtttcaatt 
tgaaccatcc 
acattataca 
gagaaatgtt 
ttcatcaagc 
acggatttga 
gcaatatgac 
tgattcaacg 



tatgtcatca 

tactagttta 

ctcatttgca 

attgattttt 

tacatcatac 

tttatgttgg 

tttaatattc 

gttcctatct 

cttgaaattc 

ctttttgtgt 

cttaggaggt 

gatagtcgaa 

aactttgtct 

ttgtaagtta 

tcaaacgttt 

aataaaagta 

cgacaaaacg 

gaacaacgaa 

acgaagtgct 

tgaaagcgat 

acttcttaat 

cgatgatatg 

gtgggtgttg 

agagctacca 

tgctgagatt 

ggacaagctg 

cgaatcagga 

tagagaaacc 

caaacacggt 

catcattaca 

gcaagtagaa 

ttcaatactt 

aaacagattg 

ttataactta 

aattaataat 

acaacaatac 

caatggaaca 

gagaggctat 

tcaataatga 

taggaagatt 

aatcaattca 

tatcaatttt 

caaaagttta 

aattaactat 

tgcgaaagaa 

aagaaacaac 

aattcaacgg 

taagttttag 

ccttcatatc 

tttaaagtaa 

cgaaaggagc 

caatattcaa 

actagaagtt 

agaaaataca 

tcactatatt 

tagtgaacct 



cacttgcagg 

ccatctattt 

ttaggtaact 

tcctttttta 

tagacgtctt 

aaagtgaggt 

atttctcctt 

actttgccca 

tttagagata 

ccatctaata 

gcgtatgttc 

tcagcatttg 

tataattctt 

gactctttat 

tagttaagta 

gacattatcg 

cccataagcc 

tctggagaag 

aattctcttt 

ttcataattt 

attatatagg 

tgacactgtt 

gattatttga 

gtgtactcga 

ttgaagttga 

ttgggaatac 

gaacttaata 

acacatcttc 

cttaaagaaa 

gacgcattcg 

atcaaagagt 

aacaaagtta 

ataaccacgc 

agccgcgttt 

aggaaaaagg 

gtaagaacag 

ttaggatatg 

acgcaccaat 

ttatacagtc 

gctagaaaat 

atatacgcaa 

gtgttgactg 

gtcaacaaac 

gttggagaac 

ttcaaatggt 

ccaactcaaa 

ccagatggtt 

tttgttaata 

aatcacatta 

aaatggcaag 

cgatttagaa 

gaggaagctc 

tcaaaaagct 

tggagtgaca 

tcgagaaatc 

cgatgacttc 

aaaaataaat 

aaccctgaac 

taaggatttg 

tagggtctag 

ctggaaaaac 

ctagaagttt 

ataaacaaca 

gaaaaagaaa 

aagacagcat 

gattacacag 

gaccacgcac 

ggcaaacgtg 



cgttttttga 
tttgtgaaat 
cataagtgaa 
ttttgcaatt 
tttcaaataa 
agtaggtaat 
tgtttatatt 
ttactttaat 
ccaaattaat 
caacgagtgc 
cttgttttaa 
atggcgtttc 
ctcctatgcc 
attcatctat 
cgttttcttg 
tttcatcttg 
acgcttcacc 
accttccatt 
gaaagggttt 
gttttaatct 
aagggaaata 
caaaattggg 
atactaatac 
cacaatttaa 
acaacaaagt 
ctataaaaga 
agtaaaggag 
aaatgaagca 
caaagtacac 
ctggtttgga 
tccaaaaagt 
ctgatttaac 
ttatcaacat 
aaatacattc 
aggatactca 
tagaaattga 
caagatcaaa 
ttagtgcatc 
taatcttcga 
tcaaacgctg 
cagacaatgt 
agtataagaa 
caaaagtatt 
tagcgaaaat 
taagaaataa 
agagtatgga 
caagtaaagt 
agtttttagg 
accaaagaag 
aaaccaatca 
gaaatcaata 
aatcatccga 
tatgtacaag 
cttaattcag 
aaaaactatt 
gaataaagga 
taaaactcaa 
tgttgcgagc 
ccctgcctcc 
aaaaaatgtt 
aattttcttt 
catacttatc 
tgcaagcatt 
atggagaaat 
ataaagattg 
ctatcgctca 
tcacactaga 
caagacaata 



ttagtaaaat 
aaattccaag 
tggttgatta 
agttattttc 
gcatgattaa 
aaatataaga 
tatattaaag 
atcactaaac 
atagtcttcg 
aattgtacca 
cataggttcc 
gtcttcttta 
agcaccagtt 
agaagtgact 
gcggggaggt 
acgttcttcg 
gacatttaaa 
aacatactgg 
cgacttttct 
ttcagaagtg 
aaaatcaata 
gttatagtta 
aacttttgat 
ctttgctata 
accatggaaa 
tgttcacaaa 
gcataacaca 
tcaaaacctt 
tcgagacctt 
aatggtagag 
agtgaaagcg 
agagtggcgg 
ccacattgag 
gatagtcatt 
aatgcaagca 
aaacgaacct 
caatgccatt 
aggtcaaaac 
tgcttctaaa 
ggtaacatca 
aattgaacaa 
agaaaaagag 
attcgctgac 
acttaaacaa 
tggacatctc 
tctaaaaatc 
atcacgtaca 
agaaaaacaa 
agttgaaaga 
gttcaggttc 
aaaaactcaa 
ttccgctaaa 
atgttcatga 
acttgagtga 
atttatacat 
ggaacaacaa 
attactcaag 
agttgcagag 
acacttagag 
tcgatttcct 
aaatccgaaa 
acctccttag
acaaacaaat 
cgcaatcagc 
gtttccaaga 
aaaaagagca 
cactgcaaaa 
tttcatccaa 
28081
28141 
28201 
28261 
28321 
28381 
28441 
28501 
28561 
28621 
28681 
28741 
28801 
28861 
28921 
28981 
29041 
29101 
29161 
29221 
29281 
29341 
29401 
29461 
29521 
29581 
29641 
29701 
29761 
29821 
29881 
29941 
30001 
30061 
30121 
30181 
30241 
30301 
30361 
30421 
30481 
30541 
30601 
30661 
30721 
30781 
30841 
30901 
30961 
31021 
31081 
31141 
31201 
31261 
31321 
31381 
31441 
31501 
31561 
31621 
31681 
31741 
31801 
31861 
31921 
31981 
32041 
32101 



gttgaaaaag 
aacacaatca 
gatgcagtag 
caaaacggta 
cttattaaac 
ttattcgaaa 
acgccaaaag 
caaacaactt 
acaatggcag 
attagtagca 
gctgaaaatt 
aaataacaac 
caccagaaaa 
cacaaatcca 
accgcaaaga 
tgattaatat 
aggatattaa 
cttagcgatt 
cgcaagtatc 
ctacttgttg 
aacgaaaaac 
ttcatgttaa 
tacaagttaa 
acttagatat 
cagacgaaca 
agactgtaac 
ctgataacaa 
gtatggaaga 
aaactattga 
gtaagcatac 
tataaatttg 
aaagacgctt 
gacgtagaaa 
ttacaggaga 
aaacttagag 
aatgattggg 
caagaagaat 
gatgatgaag 
aaagctatta 
aacggagaaa 
aagattagac 
acggacgtag 
aattatgaaa 
taacggctca 
caatgataga 
taaacataat 
attagttact 
tcttattggt 
gtattttacg 
acctattccg 
atcaatgtct 
agatttagcg 
cgacggtact 
actagaaaac 
tatagaacaa 
accagtagaa 
agaaatcagt 
agcgtttatg 
agataaagcg 
tcacgcagac 
ccactatgac 
tggcgttaag 
gaggctcaat 
atagcactcc 
gtggagaaaa 
aggcggacaa 
ttaatacttc 
ttaacattct 



catggaacag 

atcaattaga 

ctactactaa 

taaacatcgg 

gcaagggtgt 

ttaaagaaac 

taacaggtaa 

aataggagga 

ttgtgacgtg 

gggcgttgag 

ctactgaatc 

attatacacg 

cacatataga 

tcaattgttt 

taatttaggt 

ttctaaattg 

atgagcaaca 

gtacttatgc 

gcaacattca 

gagcaagtaa 

ggaggaagtc 

cggattcgat 

agatatgaac 

ggcatcagac 

ggacagacta 

ttatatcatt 

ttcagatatt 

agcgagtatc 

gtacgaggag 

tcaaaaaact 

cagtatacgg 

tcgtcattga 

tcgagaacta 

tgagagaaaa 

atatgacatt 

gagaagttgc 

acaaattcca 

gtagcactat 

cttctcaaag 

agaaagctag 

attcaccttc 

tagaagcaat 

atcacaggac 

gcagggtttc 

gaaaatagat 

caatttgtac 

cgattaggta 

aagttttgtc 

gatttttcat 

aagacagata 

caacaaagca 

ttttaaggtg 

tattccgtcg 

ggatatccac 

cgcaaaaaaa 

tcaactagaa 

ctgcgcgact 

tttcatcacc 

ttattatatt 

ctggcacatt 

aaacatgtat 

tcgtttgatg 

aaaatgttga 

taatcgtcat 

ttttaaaatc 

actaattgag 

aaattcaatg 

tttaacaaat 



cccagaaatg 
aacaaagatt 
gacatcaatt 
gcaacgcaga 
ggattataac 
atcaatcaca 
aggacaacaa 
attacaaatg 
gaaggtttgg 
tgactatcta 
tgctcgtcgc 
aaaggaaaga 
ggcgaagaaa 
ggagtatgta 
gtagaaaatt 
gaagagtatt 
tttataaaag 
cgtttctata 
tgtactacaa 
cagtatcaaa 
aagatgtatt 
tttaagctat 
aacgtaccaa 
ttattcaacc 
attaacttag 
cgtcataggg 
agttactcca 
aatatggatt 
gtagaacatg 
aaagataaat 
aaaaattggc 
cattaacgaa 
tcaacacttt 
cggacaagaa 
gaatgatgtg 
tgaacgaatt 
ctttgttatt 
caaccctact 
tgatgtgtta 
atatattcta 
aataacaatt 
tagaaatgga 
aagcgcaatt 
aagctggaga 
atttcacaat 
cgccgtataa 
ttaagttaaa 
acttggtatt 
ttattaaacc 
agcaaaaagc 
atccatttga 
tggtttaaat 
ttgctactgg 
taaaagcaga 
tattcgcaat 
aattattaca 
gttctatgaa 
aaatacctat 
gggctacaat 
atgaagcagt 
tagcgctatg 
ataaatacca 
aaggagagaa 
cttggcggaa 
tccgtttagt 
ccttttttga 
ccagaaagtt 
tctaatcccg 



attatgcaac 
gcacgtgaca 
ttagttggag 
ttgtttgagt 
atgcctacac 
cattcggacg 
tactttgcca 
aacgcaccat 
aagattgaga 
aacaacaaat 
cttttgaagt 
tagaaatgcc 
aacttgtgaa 
gaagtacagt 
tatacattga 
tgatcagaaa 
ctacctagta 
cttcactaca 
agaatgcttt 
cacttaagaa 
acgaaatagg 
tcattttaaa 
ttaaacatgc 
aagcaataga 
tcatgaaatg 
atacgccaat 
caaatagaaa 
atcacaaagc 
actgaggaaa 
aatatcgctg 
tcaggaaaaa 
ggtggaacaa 
gtttatgttg 
atcaatgttg 
atgaaaaata 
gtcagtatgt 
acaggtcatg 
atcactattg 
gctagggcaa 
aacgctgaac 
aacaataaga 
aactaaaaat 
tactaaagaa 
attcacagtg 
cgtatttgaa 
atatgatttc 
tcttcctagc 
gaaatggaaa 
ttacaaaaag 
tgaagaaaat 
aagcagtggc 
gcaatacatt 
tgttgaactt 
agtagaggtt 
gtgtagagat 
aacagaattg 
agttgcaagg 
gagtgtagaa 
caaccgcaac 
cggcagaggc 
tcgcgaacat 
cttgcatgac 
aaaggaatga 
gagattagaa 
taatacaggt 
tgtctattac 
tacttattgt 
aaacaaatct 



gtgctttaaa 
aaccaaaaat 
agttagcaaa 
ggttacgtca 
agtattcaat 
gtcacacatc 
acaagttttt 
acaaaacaac 
agcacactag 
ctttaaccat 
tcgccgaaca 
aaaaatcata 
aaagttatac 
atacaactgg 
ttattcacca 
gcataaaaaa 
gcagtattat 
gcatggtcaa 
ttcaaagaat 
aaaattcatg 
cgaaatcata 
aggtcatatg 
ttatgtcgta 
tgaatggatt 
gtaggaggtc 
ttatataact 
tagagctagg 
aatcaagaaa 
aacaagaacc 
agaaaaataa 
ccacgtttgc 
cggttactga 
taaatttttt 
tagttattga 
agtctaaaaa 
acagattaat 
aaggtatcaa 
aagcgcaaga 
tgattgaaga 
cttctaatac 
aatttgcaaa 
taattaaaag 
acaaatcaag 
aaagttaaaa 
aatgatgaag 
caagaaaaac 
ttagattttg 
ttcaatgaag 
ggcgatgatg 
aacggggcac 
caatttggat 
acaagatacc 
gaacaaagtc 
ccggacaata 
atagaacttc 
gaaactatga 
gagttaatag 
acgagtaagt 
tgtgtaatat 
atgaacagaa 
cacaacgagc 
tcgtggataa 
atagactaag 
atgctatgca 
ttttacaaaa 
ccaggggctg 
ttctaggttg 
ttgtttttct 



aattgctaac 
tgtatttgca 
gatcaccaaa 
aaacggattc 
ggaacgtgag 
aattagtaag 
aggagaaaaa 
cctcctcatc 
aaaacctgtg 
accgaaagat 
aactattagc 
gtaccaccaa 
gcaacaccta 
ttgaaatatt 
acaggcactc 
tggtattagg 
gcttcacagt 
ttgcgggatt 
aaaaaaactg 
ttcaatataa 
cgcaaaaata 
ggcatatcaa 
gatgagaatg 
gaagagaaca 
gctatgaagc 
aacaaaccaa 
gagtttaacg 
acagtgacag 
acaagaaaaa 
aaggaaattc 
tacaagagat 
cgaaggatca 
acctcaaatt 
aactattcaa 
accaacgttt 
aggaaaactt 
caaagataaa 
acaaattaaa 
atttgatgat 
gtttgaaaca 
tcctagcatt 
gacggtattt 
aaaagtttta 
atattgaact 
gcaaacaata 
aattgattga 
ataccaatga 
atgaaggtaa 
ttgttaacaa 
aacaacaaac 
atgacgacca 
agaaagataa 
acattgactt 
aaaaactatc 
actggggcga 
aaggttatga 
aactgattat 
tgctaagcga 
gcggaaagcc 
acaaaatgaa 
aacatgcgat 
aagttgatga . 
aataataaaa 
tgctgtaaaa 
gctttaccat 
taatgtaact 
tgtcctgact 
ataatcttat 
32161 taaagtgatt taaaaactga ggagcataaa acttattata aattcctttt tttgttaagt
32221 aagacatgtc aaaagtttca tttaaaaccc ctaaccttac taggttatta attgaaattt 
32281 cggttgattc tatatctaac ggagagtctt ttattaacgt gtccgatata ttcataccgt 
32341 cattctttgg gttcaaaacc gctctatatt taacggcagg atgtacttcg tgattcttta 
32401 aatgttttaa aagaatagca tcatttgggg ataattgttt aattatttca acaaatgaat 
32461 ggtgggttaa tgagtttttt ctgtcatcca tagatgatgc tattagtttt gcgaacatat 
32521 tacttaaagt tttttcacta atgtaaaact ttgaagcttc tagagcagga cctagaagag 
32581 aaaattgtgg ttcttgtaaa ttatttttag gtacagaaga tatttctttt ttaaattgtt 
32641 ctttgaattt ttcaaattct acttctcttt gataaataac tttatccaca taaaggtgga 
32701 atttcccaaa gacaagttcc caagtttcag agaatgtttc tacaggccct tttgatgcgc 
32761 cttcaataat tttatcaata cctttaccta aaataggatc cataattatt cacccccaat 
32821 ctaacgcaat agcgataata aaattatacc agaaaggaga atcaacatga ctgaccaacc 
32881 aagttactac tcaataatta cagcaaatgt cagatacgat aaccgactta ctgacagcga 
32941 aaagttactt tttgcagaaa taacatcttt aagtaacaaa tacggatact gcacagcaag 
33001 taatggttac tttgcaactt tatacaacgt tgttaaggaa actatatctc gtagaatttc 
33061 gaaccttacc aactttggtt atctaaaaat cgaaattatc aaagaaggta atgaagttaa 
33121 acaaaggaag atgtacccct tgacgcaaac gtcaatacct attgacgcaa aaatcaatac 
33181 ccctattgat aattctgtca atacccctat tgacgcaaat gtcaaagaga atattacaag 
33241 tattaataat acaagtaata acaatataaa tagaatagat atattgtcgg gcaacccgac 
33301 agcatcttct ataccctata aagaaattat cgattactta aacaaaaaag cgggcaagca 
33361 ttttaaacac aatacagcta aaacaaaaga ttttattaaa gcaagatgga atcaagattt 
33421 taggttggag gattttaaaa aggtgattga tatcaaaaca gctgagtggc taaacacgga 
33481 tagcgataaa taccttagac cagaaacact ttttggcagt aaatttgagg ggtacctcaa 
33541 tcaaaaaata caaccaactg gcacggatca attggaacgc atgaagtacg acgaaagtta 
33601 ttgggattag ggggatatta tgaaaccact attcagcgaa aagataaacg aaagcttgaa 
33661 aaaatatcaa cctactcatg tcgaaaaagg attgaaatgt gagagatgtg gaagtgaata 
33721 cgacttatat aagtttgctc ctactaaaaa acacccgaat ggttacgagt ataaagacgg 
33781 ttgcaaatgt gaaatctatg aggaatataa gcgaaacaag caacggaaga taaacaacat 
33841 attcaatcaa tcaaacgtta atccgtcttt aagagatgca acagtcaaaa actacaagcc 
33901 acaaaatgaa aaacaagtac acgctaaaca aacagcaata gagtacgtac aaggcttctc 
33961 tacaaaagaa ccaaaatcat taatattgca aggttcatac ggaactggta aaagccacct 
34021 agcatacgct atcgcaaaag cagtcaaagc taaagggcat acggttgctt ttatgcacat 
34081 accaatgttg atggatcgta tcaaagcgac atacaacaaa aatgcagtag agactacaga 
34141 cgagctagtc agattgctaa gtgatattga tttacttgta ctagatgata tgggtgtaga 
34201 aaacacagag cacactttaa ataaactttt cagcattgtt gataacagag taggtaaaaa 
34261 caacatcttt acaactaact ttagtgataa agaactaaat caaaatatga actggcaacg 
34321 tataaattcg agaatgaaaa aaagagcaag aaaagtaaga gtaatcggag acgatttcag 
34381 ggagcgagat gcatggtaac caaagaattt ttaaaaacta aacttgagtg ttcagatatg 
34441 tacgctcaga aactcataga tgaggcacag ggcgatgaaa ataggttgta cgacctattt 
34501 atccaaaaac ttgcagaacg tcatacacgc cccgctatcg tcgaatatta aggagtgtta 
34561 aaaatgccga aagaaaaata ttacttatac cgagaagatg gcacagaaga tattaaggtc
34621 atcaagtata aagacaacgt aaatgaggtt tattcgctca caggagccca tttcagcgac 
34681 gaaaagaaaa ttatgactga tagtgaccta aaacgattca aaggcgctca cgggcttcta 
34741 tatgagcaag aattaggttt acaagcaacg atatttgata tttagaggtg gacgatgagt 
34801 aaatacaacg ctaagaaagt tgagtacaaa ggaattgtat ttgatagcaa agtagagtgt 
34861 gaatattacc aatatttaga aagtaatatg aatggcacta attatgatca tatcgaaata
34921 caaccgaaat tcgaattatt accaaaacta gataaacaac gaaagattga atatattgca 
34981 gacttcgcgt tatatctcga tggcaaactg attgaagtta tcgacattaa aggtatgcca 
35041 accgaagtag caaaacttaa agctaagatt ttcagacata aatacagaaa cataaaactc 
35101 aattggatat gtaaagcgcc taagtataca ggtaaaacat ggattacgta cgaggaatta 
35161 attaaagcaa gacgagaacg caaaagagaa atgaagtgat ctaatgcaac aacaagcata 
35221 tataaatgca acgattgata taaggatacc tacagaagtt gaatatcagc attttgatga 
35281 tgtggataaa gaaaaagaag cgctggcaga ttacttatat aacaatcctg acgaaatact 
35341 agagtatgac aatttaaaaa ttagaaacgt aaatgtagag gtggaataaa tgggcagtgt 
35401 tgtaatcatt aataataaac catataaatt taacaatttt gaaaaaagaa ataatggcaa 
35461 agcgtgggat aaatgctgga attgtttcta aacgtgttag aggttgttgg gagttttcag 
35521 aagctttaga cgcgccttat ggcatgcacc taaaagaata tagagaaatg aaacaaatgg 
35581 aaaagattaa acaagcgaga ctcgaacgtg aattggaaag agagcgaaag aaagaggctg 
35641 agctacgtaa gaagaagcca catttgttta atgtacctca aaaacattca cgtgatccgt 
35701 actggttcga tgtcacttat aaccaaatgt tcaagaaatg gagtgaagca taatgagcat 
35761 aatcagtaac agaaaagtag atatgaacaa aacgcaagac aacgttaagc aacctgcgca 
35821 ttacacatac ggcgacattg aaattataga ttttattgaa caagttacgg cacagtaccc
35881 accacaatta gcattcgcaa taggtaatgc aattaaatac ttgtctagag caccgttaaa 
35941 gaatggtcat gaggatttag caaaggcgaa gttttacgtc gatagagtat ttgacttgtg 
36001 ggagtgatga ccatgacaga tagcggacgt aaagaatact taaaacattt tttcggctct 
36061 aagagatatc tgtatcagga taacgaacga gtggcacata tccatgtagt aaatggcact 
36121 tattactttc acggtcatat cgtgccaggt tggcaaggtg tgaaaaagac atttgataca 
36181 gcggaagagc ttgaaacata tataaagcaa agtgatttgg aatatgagga acagaagcaa 
36241 ctaactttat tttaaaaggg cggaaacaat gaaaatcaaa attgaaaaag aaatgaattt

36301 acctgaactt atccaatggg cttgggataa ccccaagtta tcaggtaata aaagattcta 

36361 ttcaaatgat gttgagcgca actgttttgt gacttttcat gttgatagca tcttatgtaa 

36421 tgtgactgga tatgtatcaa ctaacgataa atttactgtt caagaggaga tataacaatg 

36481 aaaatcaaag ttaaaaaaga aatgagatta gatgaattaa ttaaatgggc gcgagaaaat 

36541 ccggatctat cacaaggaaa aatatttttt tcaacaggat ttagtgatgg attcgttcgt 

36601 tttcatccaa atacaaataa gtgttcgacg tcaagtttta ttccaattga tatccccttc 

36661 atagttgata ttgaaaaaga agtaacggaa gagactaagg ttgataggtt gattgaatta 

36721 ttcgagattc aagaaggaga ctataactct acactatatg agaacactag tataaaagaa 

36781 tgtttatatg gcagatgtgt gcctaccaaa gcattctaca tcttaaacga tgacctaact 

36841 atgacgttaa tctggaaaga tggggagttg ctagtatgat gttgaaattt aaagcttggg 

36901 ataaagataa aaaagttatg agtattattg acgaaatcga ttttaatagt gggtacattt 

36961 tgatttcaac aggttataaa agtttcaatg aagtaaaact attacaatac acaggattta 

37021 aagatgtgca cggtgtggag atttatgaag gggatattgt tcaagattgt tattcgagag 

37081 aagtaagttt tatcgagttt aaagaaggag ccttttatat aacttttagc aatgtaactg 

37141 aattactaag tgaaaatgac gatattattg aaattgttgg aaatattttt gaaaatgaga 

37201 tgctattgga ggttatgaga tgacgttcac cttatcagat gaacaatata aaaatctttg 

37261 tactaactct aacaagttat tagataaact tcacaaagca ttaaaagatc gtgaagagta 

37321 caagaagcaa cgagatgagc ttattgggga tatagcgaag ttacgagatt gtaacaaaga 

37381 tctagagaag aaagcaagcg catgggatag gtattgcaag agcgttgaaa aagatttaat 

37441 aaacgaattc ggtaacgatg atgaaagagt taaattcgga atggaattaa acaataaaat 

37501 ttttatggag gatgacacaa atgaataatc gcgaaaaaat cgaacagtcc gttattagtg 

37561 ctagtgcgta taacggtaat gacacagagg ggttgctaaa agagattgag gacgtgtata 

37621 agaaagcgca agcgtttgat gaaatacttg agggaatgac aaatgctatt caacattcag 

37681 ttaaagaagg tattgaactt gatgaagcag tagggattat ggcaggtcaa gttgtctata 

37741 aatatgagga ggaataggaa aatgactaac acattacaag taaaactatt atcaaaaaat 

37801 gctagaatgc ccgaacgaaa tcataagacg gatgcaggtt atgacatatt ctcagctgaa 

37861 actgtcgtac tcgaaccaca agaaaaagca gtgatcaaaa cagatgtagc tgtgagtata 

37921 ccagagggct atgtcggact attaactagt cgtagtggtg taagtagtaa aacgtattta 

37981 gtgattgaaa caggcaagat agacgcggga tatcatggca atttagggat taatatcaag 

38041 aatgatgaag aacgtgatgg aatacccttt ttatatgatg atatagacgc tgaattagaa 

38101 gatggattaa taagcatttt agatataaaa ggtaactatg tacaagatgg aagaggcata 

38161 agaagagttt accaaatcaa caaaggcgat aaactagctc aattggttat cgtgcctata 

38221 tggacaccgg aactaaagca agtggaggaa ttcgaaagtg tttcagaacg tggagcaaaa 

38281 ggcttcggaa gtagcggagt gtaaagacat cttagatcga gttaaggagg ttttggggaa 

38341 gtgacgcaat acttagtcac aacattcaaa gattcaacag gacgaccaca tgaacatatt 

38401 actgtggcta gagataatca gacgtttaca gttattgagg cagagagtaa agaagaagcg 

38461 aaagagaagt acgaggcaca agttaaaaga gatgcagtta ttaaagtggg tcagttgtat 

38521 gaaaatataa gggagtgtgg gaaatgacgg atgttaaaat taaaactatt tcaggtggag 

38581 tttattttgt aaaaacagct gaaccttttg aaaaatatgt tgaaagaatg acgagtttta 

38641 atggttatat ttacgcaagt actataatca agaaaccaac gtatattaaa acagatacga 

38701 ttgaatcaat cacacttatt gaggagcatg ggaaatgaat cagctgagaa ttttattaca 

38761 tgacggtagt agtttgatat tacatgaaga tgaattattt aacgaaatag tatttgtttt 

38821 ggacaatttt agaaatgatg atgactattt aacgatagaa aaagattatg gcagagaact 

38881 tgtattgaac aaaggttata tagttgggat caatgttgag gaggcagatg atgattaaca 

38941 tacctaaaat gaaattcccg aaaaagtaca ctgaaataat caaaaaatat aaaaataaag 

39001 cacctgaaga aaaggctaag attgaagatg attttattaa agaaattaaa gataaagaca 

39061 gtgaatttta cagtcctacg atggctaata tgaatgaata tgaattaagg gctatgttaa 

39121 gaatgatgcc tagtttaatt gatactggag atgacaatga tgattaaaaa acttaaaaat 

39181 atggatgggt tcgacatctt tattgttgga atactgtcat tattcggtat attcgcattg 

39241 ctacttgtta tcacattgcc tatctataca gtggctagtt accaacacaa agaattacat 

39301 caaggaacta ttacagataa atataacaag agacaagata aagaagacaa gttctatatt 

39361 gtattagaca acaaacaagt cattgaaaat tccgacttat tattcaaaaa gaaatttgat 

39421 agcgcagata tacaagctag gttaaaagta ggcgataagg tagaagttaa aacaatcggt 

39481 tatagaatac actttttaaa tttatatccg gtcttatacg aagtaaagaa ggtagataaa 

39541 caatgattaa acaaatacta agactattat tcttactagc aatgtatgag ttaggtaagt 

39601 atgtaactga gcaagtgtat attatgatga cggctaatga tgatgtagag gcgccgagtg 

39661 attacgtctt tcgagcggag gtgagtgaat aatgagaata tttatttatg atttgatcgt 

39721 tttgctgttt gctttcttaa tatccatata tattattgat gatggagtga taataaatgc 

39781 attaggaatt tttggtatgt ataaaattat agattccttt tcagaaaata ttataaagag 

39841 gtagataaaa atgaacgagc aaataatagg aagcatatat actttagcag gaggtgttgt 

39901 gctttattca gttaaagaga tttttaggta ttttacagat tctaacttac aacgtaaaaa

39961 aatcaattta gaacaaatat atccgatata tttagattgt tttaaaaagg ctaaaaagat 

40021 gattggagct tatattattc caacagaaca gcatgaattt ttagattttt ttgatattga 

40081 agtctttaat aatttagata agcaaagtaa aaaagcgtat gaaaatgtta ttggatttag 

40141 acaaatgatt aatttatcaa atagagttaa ggcaatggaa gattttaaga tgagtttcaa 

40201 caatgaattt agtacaaatc agattttttt taatccttct tttgttatgg aaacaattgc 

40261 tattataaat gaatatcaaa aagatatatc ttatttaaaa aatataatta ataaaatgaa 
40321 tgaaaataga gcttataatc atattgatag ttttatcact tcagagtacc gacgaaaaat

40381 aaacgattat aatctttatc ttgataaatt tgaagaacag tttagtcaaa agtttaaaat 

40441 aaacagaact tcgataaaag aaagaattat tattaattta aacaagagga gatttaaatg 

40501 atgtggatta ctatgactat tgtatttgct atattgctat tagtttgtat cagtattaat 

40561 agtgatcgtg caagagagat acaagcactt agatatatga atgattatct acttgatgaa 

40621 gtagttaaaa ctaaagggta caacgggtta gaagaataca ggattgaatt gaagcgaatg 

40681 aataacgata ttaaaaagta atttatatta tcggaggtat tgcattgaat gataaagatt 

40741 gagaaacacg atatcaaaaa gcttgaagaa tacattcagc acatcgataa ctatcgaaga 

40801 gagttgaaga tgcgagaata tgaattactt gaaagtcatg aaccagataa tgcgggagct 

40861 ggcaaaagta atttgccggg taacccgatt gaacgatgtg caataaagaa gtttagtgat 

40921 aacaggtaca atacattaag aaatatagtt aacggtgtag atagattgat aggtgaaagt 

40981 gatgaggata cgcttgagtt attaaggttt agatattggg attgtcctat tggttgttat 

41041 gaatgggaag atatagcaca ttactttggt acaagtaaga caagtatatt acgtagaagg 

41101 aatgcactga tcgataagtt agcaaagtat attggttatg tgtagcggac ttttacccta 

41161 tgtaagtccg cattaaaaca gtttattatg ttagtatcag attaatattc aaagttatta 

41221 aatgctaata cgacgcatga acaagaggcg catcactatg tgatgtgtct ttttatttat 

41281 gaggtatgaa catgttcaaa ctaattgtaa atacattact acacatcaag tatagatgag 

41341 tcttgatact acttaagcta tataaggtga aacattatga tgactaaaga cgaacgtata 

41401 cgattctata agtctaaaga atggcaaata acaagaaaaa gagtgctaga aagagataat 

41461 tatgaatgtc aacaatgtaa gagagacggc aagttaacga catatgacaa aagcaagcgt 

41521 aagtcgttgg atgtagatca tatattatcg ctagaacatc atccggagtt tgctcatgac 

41581 ttaaacaatt tagaaacact gtgtattaaa tgtcacaaca aaaaagaaaa gagatttata 

41641 aaaaaagaaa ataaatggaa agacgaaaaa tggtaaatac ccccgggtca aaaaaatcaa 

41701 aagcgatc 
Table 3

    Name      Position
1   77ORF005  19572..21026
2   77ORF006  3976..5196
3   77ORF007  21871..23076
4   77ORF008  2120..3307
5   77ORF009  31946..32803
6   77ORF010  26092..26889
7   77ORF011  24441..25208
8   77ORF012  29788..30576
9   77ORF013  33620..34399
10  77ORF014  27760..28512
11  77ORF015  3291..4028
12  77ORF016  32867..33610
13  77ORF017  23269..23982
14  77ORF018  31169..31840
15  77ORF019  39851..40501
16  77ORF020  6926..7570
17  77ORF021  37762..38304
18  77ORF022  30605..31156
19  77ORF023  26903..27346
20  77ORF024  10700..11140
21  77ORF025  9707..10147
22  77ORF026  40729..41145
23  77ORF027  6518..6925
24  77ORF028  34795..35199
25  77ORF029  6117..6521
26  77ORF030  36478..36879
27  77ORF031  39151..39546
28  77ORF032  33892..34266
29  77ORF033  5758..6120
30  77ORF034  7886..8236
31  77ORF035  19258..19560
32  77ORF036  36876..37223
33  77ORF037  102..446
34  77ORF038  34908..35219
35  77ORF039  37220..37528
36  77ORF040  41377..41676
37  77ORF041  35454..35753
38  77ORF042  5490..5774
39  77ORF043  29304..29564
40  77ORF044  18481..18768
41  77ORF045  5216..5500
42  77ORF046  25663..25935
43  77ORF047  11159..11425
44  77ORF048  28776..29039
45  77ORF049  36013..36255
46  77ORF050  35753..36007
47  77ORF051  38931..39167
48  77ORF052  1762..2013
49  77ORF053  37521..37757
50  77ORF054  22818..23060
51  77ORF055  17546..17788
52  77ORF058  18892..19122
53  77ORF059  34564..34785
54  77ORF064  29574..29795
55  77ORF065  28528..28746
56  77ORF066  27494..27703
57  77ORF069  38341..38547
58  77ORF070  36269..36475
59  77ORF071  40498..40701
60  77ORF072  38735..38938
61  77ORF073  30945..31148
62  77ORF074  38544..38738
63  77ORF075  13673..13870
64  77ORF077  25357..25605
65  77ORF079  29089..29280
66  77ORF080  35204..35389
67  77ORF085  24060..24242
68  77ORF092  39706..39876
69  77ORF094  32226..32393
70  77ORF096  13606..13773
71  77ORF098  7092..7256
72  77ORF102  29051..29212
73  77ORF104  34393..34551
74  77ORF109  18282..18434
75  77ORF112  39543..39692
76  77ORF117  27361..27501
77  77ORF118  38390..38530
78  77ORF120  36059..36199
79  77ORF124  33699..33833
80  77ORF128  14221..14355
81  77ORF130  15675..15806
82  77ORF133  8414..8542
83  77ORF140  13113..13235
84  77ORF147  7029..7148
85  77ORF149  30668..30787
86  77ORF151  31837..31953
87  77ORF155  30278..30391
88  77ORF157  4044..4157
89  77ORF167  20692..20799
90  77ORF175  35717..35821
91  77ORF176  6836..6940
92  77ORF178  35390..35491
93  77ORF179  8318..8419
94  77ORF182  29268..29564
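
The Position column of Table 3 gives each ORF as an inclusive start..end range on the genome sequence. As a minimal sketch (the helper names `orf_length` and `encoded_residues` are illustrative and not part of the source, and the second helper assumes the range ends on a stop codon), the nucleotide length and the number of encoded amino acids can be derived from a Position entry like this:

```python
def orf_length(position: str) -> int:
    """Length in nucleotides of an inclusive 'start..end' range."""
    start, end = (int(part) for part in position.split(".."))
    return end - start + 1


def encoded_residues(position: str) -> int:
    """Amino acids encoded, assuming the range includes a final stop codon."""
    length = orf_length(position)
    if length % 3:
        raise ValueError(f"{position} is not a whole number of codons")
    return length // 3 - 1  # drop the stop codon


# 77ORF017 as listed in Table 3
print(orf_length("23269..23982"), encoded_residues("23269..23982"))
```

For 77ORF017 this gives 714 nucleotides and 237 residues, which agrees with the "Number of amino acids" reported for that ORF.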
Table 4

77ORF017 sequence

23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct
1 MTHNIEKRINKLKTS
23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta
16 GNPKFKKLDSDIHYL
23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca
31 LKRFEGEKNHKGFYP
23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac
46 KFKQGEIVFVDFGIN
23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat
61 VNKEFSNSHFAIVMN
23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta
76 KNDSNTEDIVNVIPL
23712 tcctctaaagaaaacaaaaagtatttaaagatgaattttgatttg
91 SSKENKKYLKMNFDL
23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg
106 KWEYYLRLFLNLISA
23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac
121 QNNSAILKEVFDKKY
23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa
136 QKNNTEFITKDYFIE
23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt
151 FISDSLEIENKLNKI
23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa
166 DRNINNIVSAIDKVK
23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg
181 KLKGNSYACINSFQP
23397 attagtaagtttcgcataagaaaagttttaccccaaaaaattaaa
196 ISKFRIRKVLPQKIK
23352 aatccagtaatagattcttcggatattatgttactgataaataga
211 NPVIDSSDIMLLINR
23307 attaataataatatattgcagatccctgatataagatga 23269
226 INNNILQIPDIR*
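
Each pair of rows in Table 4 shows 45 nucleotides (15 codons) followed by the amino acids they encode, with `*` marking the stop codon. The sketch below reproduces that translation for the first and last rows of 77ORF017. It assumes the standard genetic code; the source does not state which translation table was used, and the names `CODON_TABLE` and `translate` are illustrative. The descending coordinates (23982 down to 23269) suggest that the rows are already written in the reading direction of the ORF, so no reverse-complement step is applied here.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumed table).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    dna = "".join(dna.lower().split())  # tolerate spaces left by extraction
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


first_row = "atgacgcataatatagaaaaacgcattaataaattaaaaacttct"  # starts at 23982
last_row = "attaataataatatattgcagatccctgatataagatga"         # ends at 23269
print(translate(first_row))
print(translate(last_row))
```

The two printed strings correspond to the amino acid rows numbered 1 and 226 for 77ORF017.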
Physico-chemical parameters of ORF 77ORF017

1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN
61 VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA
121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK
181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR

Number of amino acids: 237
Average molecular weight (Daltons): 27887.38
Mean amino acid weight (Daltons): 117.67
Monoisotopic molecular weight (Daltons): 27869.83
Mean amino acid monoisotopic weight (Daltons): 117.59

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       5       2.11%    7.58%
Cys   C       1       0.42%    1.66%
Asp   D       14      5.91%    5.28%
Glu   E       13      5.49%    6.37%
Phe   F       16      6.75%    4.09%
Gly   G       6       2.53%    6.84%
His   H       4       1.69%    2.24%
Ile   I       29      12.24%   5.81%
Lys   K       33      13.92%   5.95%
Leu   L       19      8.02%    9.42%
Met   M       4       1.69%    2.37%
Asn   N       30      12.66%   4.45%
Pro   P       7       2.95%    4.9%
Gln   Q       6       2.53%    3.97%
Arg   R       8       3.38%    5.16%
Ser   S       17      7.17%    7.12%
Thr   T       5       2.11%    5.67%
Val   V       11      4.64%    6.58%
Trp   W       1       0.42%    1.23%
Tyr   Y       8       3.38%    3.18%

Number of acidic (negative) amino acids (ED): 27   11.39%
Number of basic (positive) amino acids (KR): 41   17.30%
Total charge (KRED): 68   28.69%
Net charge (KR - ED): 14   5.91%
Theoretical pI: 10.01
Total linear charge density: 0.30
Average hydrophobicity: -5.37
Ratio of hydrophilicity to hydrophobicity: 1.41
Percentage of hydrophilic amino acid: 57.81%
Percentage of hydrophobic amino acid: 42.19%
Ratio of %hydrophilic to %hydrophobic: 1.37
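
The charge-related rows of the parameter sheet follow directly from residue counts: acidic residues are Glu and Asp (ED), basic residues are Lys and Arg (KR), the net charge is KR minus ED, and the mean amino acid weight is the average molecular weight divided by the number of residues. The sketch below recomputes these values for 77ORF017 from the sequence printed above. The function name `charge_summary` is illustrative; the pI and hydrophobicity figures depend on scales that the source does not specify, so they are not reproduced.

```python
from collections import Counter

# 77ORF017 protein sequence as printed in the parameter sheet (spaces removed).
SEQ = "".join("""
MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN
VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA
QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK
KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR
""".split())


def charge_summary(seq: str) -> dict:
    """Count charged residues and express them as a share of the sequence."""
    counts = Counter(seq)
    acidic = counts["E"] + counts["D"]
    basic = counts["K"] + counts["R"]
    n = len(seq)
    return {
        "length": n,
        "acidic": acidic,
        "basic": basic,
        "total_charged": acidic + basic,
        "net_charge": basic - acidic,
        "acidic_pct": round(100 * acidic / n, 2),
        "basic_pct": round(100 * basic / n, 2),
    }


summary = charge_summary(SEQ)
print(summary)
# Mean amino acid weight = average molecular weight / number of residues
print(round(27887.38 / summary["length"], 2))
```

The counts obtained this way (27 acidic, 41 basic, net charge 14, mean weight 117.67) match the values listed for 77ORF017.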
77ORF019 sequence

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt
1 MNEQIIGSIYTLAGG
39896 gttgtgctttattcagttaaagagatttttaggtattttacagat
16 VVLYSVKEIFRYFTD
39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg
31 SNLQRKKINLEQIYP
39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct
46 IYLDCFKKAKKMIGA
40031 tatattattccaacagaacagcatgaatttttagatttttttgat
61 YIIPTEQHEFLDFFD
40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat
76 IEVFNNLDKQSKKAY
40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga
91 ENVIGFRQMINLSNR
40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt
106 VKAMEDFKMSFNNEF
40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca
121 STNQIFFNPSFVMET
40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa
136 IAIINEYQKDISYLK
40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt
151 NIINKMNENRAYNHI
40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat
166 DSFITSEYRRKINDY
40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt
181 NLYLDKFEEQFSQKF
40436 aaaataaacagaacttcgataaaagaaagaattattattaattta
196 KINRTSIKERIIINL
40481 aacaagaggagatttaaatga 40501
211 NKRRFK*
Physico-chemical parameters of ORF 77ORF019

1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA
61 YIIPTEQHEF LDFFDIEVFN NLDKQSKKAY ENVIGFRQMI NLSNRVKAME DFKMSFNNEF
121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY
181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK

Number of amino acids: 216
Average molecular weight (Daltons): 26026.06
Mean amino acid weight (Daltons): 120.49
Monoisotopic molecular weight (Daltons): 26009.34
Mean amino acid monoisotopic weight (Daltons): 120.41

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       7       3.24%    7.58%
Cys   C       1       0.46%    1.66%
Asp   D       10      4.63%    5.28%
Glu   E       16      7.41%    6.37%
Phe   F       19      8.80%    4.09%
Gly   G       5       2.31%    6.84%
His   H       2       0.93%    2.24%
Ile   I       28      12.96%   5.81%
Lys   K       22      10.19%   5.95%
Leu   L       12      5.56%    9.42%
Met   M       7       3.24%    2.37%
Asn   N       23      10.65%   4.45%
Pro   P       3       1.39%    4.9%
Gln   Q       10      4.63%    3.97%
Arg   R       11      5.09%    5.16%
Ser   S       13      6.02%    7.12%
Thr   T       7       3.24%    5.67%
Val   V       7       3.24%    6.58%
Trp   W       0       0.00%    1.23%
Tyr   Y       13      6.02%    3.18%

Number of acidic (negative) amino acids (ED): 26   12.04%
Number of basic (positive) amino acids (KR): 33   15.28%
Total charge (KRED): 59   27.31%
Net charge (KR - ED): 7   3.24%
Theoretical pI: 9.52
Total linear charge density: 0.28
Average hydrophobicity: -4.84
Ratio of hydrophilicity to hydrophobicity: 1.37
Percentage of hydrophilic amino acid: 54.17%
Percentage of hydrophobic amino acid: 45.83%
Ratio of %hydrophilic to %hydrophobic: 1.18
77ORF043 sequence

29304 atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt
1 MYYEIGEIIRKNIHV
29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc
16 NGFDFKLFILKGHMG
29394 atatcaatacaagttaaagatatgaacaacgtaccaattaaacat
31 ISIQVKDMNNVPIKH
29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta
46 AYVVDENDLDMASDL
29484 tttaaccaagcaatagatgaatggattgaagagaacacagacgaa
61 FNQAIDEWIEENTDE
29529 caggacagactaattaacttagtcatgaaatggtag 29564
76 QDRLINLVMKW*
Physico-chemical parameters of ORF 77ORF043

1 MYYEIGEIIR KNIHVNGFDF KLFILKGHMG ISIQVKDMNN VPIKHAYVVD ENDLDMASDL
61 FNQAIDEWIE ENTDEQDRLI NLVMKW

Number of amino acids: 86
Average molecular weight (Daltons): 10186.68
Mean amino acid weight (Daltons): 118.45
Monoisotopic molecular weight (Daltons): 10180.02
Mean amino acid monoisotopic weight (Daltons): 118.37

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       3       3.49%    7.58%
Cys   C       0       0.00%    1.66%
Asp   D       9       10.47%   5.28%
Glu   E       7       8.14%    6.37%
Phe   F       4       4.65%    4.09%
Gly   G       4       4.65%    6.84%
His   H       3       3.49%    2.24%
Ile   I       11      12.79%   5.81%
Lys   K       6       6.98%    5.95%
Leu   L       6       6.98%    9.42%
Met   M       5       5.81%    2.37%
Asn   N       8       9.30%    4.45%
Pro   P       1       1.16%    4.9%
Gln   Q       3       3.49%    3.97%
Arg   R       2       2.33%    5.16%
Ser   S       2       2.33%    7.12%
Thr   T       1       1.16%    5.67%
Val   V       6       6.98%    6.58%
Trp   W       2       2.33%    1.23%
Tyr   Y       3       3.49%    3.18%

Number of acidic (negative) amino acids (ED): 16   18.60%
Number of basic (positive) amino acids (KR): 8   9.30%
Total charge (KRED): 24   27.91%
Net charge (KR - ED): -8   9.30%
Theoretical pI: 4.38
Total linear charge density: 0.30
Average hydrophobicity: -2.80
Ratio of hydrophilicity to hydrophobicity: 1.19
Percentage of hydrophilic amino acid: 48.84%
Percentage of hydrophobic amino acid: 51.16%
Ratio of %hydrophilic to %hydrophobic: 0.95
77ORF102 sequence

29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc
1 MSNIYKSYLVAVLCF
29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca
16 TVLAIVLMPFLYFTT
29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac
31 AWSIAGFASIATFMY
29186 tacaaagaatgctttttcaaagaataa 29212
46 YKECFFKE*
Physico-chemical parameters of ORF 77ORF102

1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE

Number of amino acids: 53
Average molecular weight (Daltons): 6155.42
Mean amino acid weight (Daltons): 116.14
Monoisotopic molecular weight (Daltons): 6151.07
Mean amino acid monoisotopic weight (Daltons): 116.06

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       6       11.32%   7.58%
Cys   C       2       3.77%    1.66%
Asp   D       0       0.00%    5.28%
Glu   E       2       3.77%    6.37%
Phe   F       7       13.21%   4.09%
Gly   G       1       1.89%    6.84%
His   H       0       0.00%    2.24%
Ile   I       4       7.55%    5.81%
Lys   K       3       5.66%    5.95%
Leu   L       5       9.43%    9.42%
Met   M       3       5.66%    2.37%
Asn   N       1       1.89%    4.45%
Pro   P       1       1.89%    4.9%
Gln   Q       0       0.00%    3.97%
Arg   R       0       0.00%    5.16%
Ser   S       4       7.55%    7.12%
Thr   T       4       7.55%    5.67%
Val   V       4       7.55%    6.58%
Trp   W       1       1.89%    1.23%
Tyr   Y       5       9.43%    3.18%

Number of acidic (negative) amino acids (ED): 2   3.77%
Number of basic (positive) amino acids (KR): 3   5.66%
Total charge (KRED): 5   9.43%
Net charge (KR - ED): 1   1.89%
Theoretical pI: 8.18
Total linear charge density: 0.13
Average hydrophobicity: 10.81
Ratio of hydrophilicity to hydrophobicity: 0.40
Percentage of hydrophilic amino acid: 28.30%
Percentage of hydrophobic amino acid: 71.70%
Ratio of %hydrophilic to %hydrophobic: 0.39
77ORF104 sequence

34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat
1 MVTKEFLKTKLECSD
34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat
16 MYAQKLIDEAQGDEN
34483 aggttgtacgacctatttatccaaaaacttgcagaacgtcataca
31 RLYDLFIQKLAERHT
34528 cgccccgctatcgtcgaatattaa 34551
46 RPAIVEY*
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Physico-chemical parameters of ORF 77ORF104 

1 MVTKEFLKTK LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY 



Number of amino acids: 52 

Average molecular weight (Daltons): 6193.13 

Mean amino acid weight (Daltons): 1 19.10 

Monoisotopic molecular weight (Daltons): 6189.12 

Mean amino acid monoisotopic weight (Daltons): 119.02 

Amino acid composition 



Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       4       7.69%   7.58%
Cys   C       1       1.92%   1.66%
Asp   D       4       7.69%   5.28%
Glu   E       6       11.54%  6.37%
Phe   F       2       3.85%   4.09%
Gly   G       1       1.92%   6.84%
His   H       1       1.92%   2.24%
Ile   I       3       5.77%   5.81%
Lys   K       5       9.62%   5.95%
Leu   L       6       11.54%  9.42%
Met   M       2       3.85%   2.37%
Asn   N       1       1.92%   4.45%
Pro   P       1       1.92%   4.9%
Gln   Q       3       5.77%   3.97%
Arg   R       3       5.77%   5.16%
Ser   S       1       1.92%   7.12%
Thr   T       3       5.77%   5.67%
Val   V       2       3.85%   6.58%
Trp   W       0       0.00%   1.23%
Tyr   Y       3       5.77%   3.18%



Number of acidic (negative) amino acids (ED): 10 19.23%

Number of basic (positive) amino acids (KR): 8 15.38%

Total charge (KRED): 18 34.62%

Net charge (KR - ED): -2 3.85%

Theoretical pI: 5.03

Total linear charge density: 0.38

Average hydrophobicity: -5.81

Ratio of hydrophilicity to hydrophobicity: 1.47

Percentage of hydrophilic amino acid: 53.85%

Percentage of hydrophobic amino acid: 46.15%
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Ratio of %hydrophilic to %hydrophobic: 1.17 
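
The residue counts and charge totals listed above for 77ORF104 can be re-derived directly from its amino acid sequence. The following is a minimal Python sketch of that bookkeeping only; the function names `composition` and `charge_summary` are hypothetical and are not taken from the software used to generate the table. It does not attempt the pI, hydrophobicity or charge-density figures, whose exact definitions are not stated in the listing.

```python
from collections import Counter

# 77ORF104 protein sequence, copied from the listing above.
SEQ = "MVTKEFLKTKLECSDMYAQKLIDEAQGDENRLYDLFIQKLAERHTRPAIVEY"


def composition(seq):
    """Return {residue: (count, percent of sequence)} for a one-letter sequence."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, round(100.0 * c / n, 2)) for aa, c in sorted(counts.items())}


def charge_summary(seq):
    """Count acidic (E, D) and basic (K, R) residues, as in the table rows."""
    acidic = sum(seq.count(aa) for aa in "ED")
    basic = sum(seq.count(aa) for aa in "KR")
    return {
        "length": len(seq),
        "acidic": acidic,
        "basic": basic,
        "total": acidic + basic,   # "Total charge (KRED)"
        "net": basic - acidic,     # "Net charge (KR - ED)"
    }


if __name__ == "__main__":
    print(charge_summary(SEQ))
    for aa, (count, pct) in composition(SEQ).items():
        print(aa, count, f"{pct:.2f}%")
```

Run on the sequence above, the sketch gives 52 residues, 10 acidic, 8 basic, and a net charge of -2, in agreement with the listed values.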
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77ORF182 sequence

29268 atgttcaatataaaacgaaaaacggaggaagtcaagatgtattac 

1 MFNIKRKTEEVKMYY

29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc

16 EIGEIIRKNIHVNGF

29358 gattttaagctattcattttaaaaggtcatatgggcatatcaata 

31 DFKLFILKGHMGISI 

29403 caagttaaagatatgaacaacgtaccaattaaacatgcttatgtc 

46 QVKDMNNVPIKHAYV

29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa 

61 VDENDLDMASDLFNQ 

29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga 

76 AIDEWIEENTDEQDR 

29538 ctaattaacttagtcatgaaatggtag 29564 

91 LINLVMKW* 



WO 00/32825 PCT/IB99/02040

Physico-chemical parameters of ORF 77ORF182

1 MFNIKRKTEE VKMYYEIGEI IRKNIHVNGF DFKLFILKGH MGISIQVKDM NNVPIKHAYV 

61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW 



Number of amino acids: 98 

Average molecular weight (Daltons): 11691.50

Mean amino acid weight (Daltons): 119.30

Monoisotopic molecular weight (Daltons): 11683.84

Mean amino acid monoisotopic weight (Daltons): 119.22 



Amino acid composition 



Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       3       3.06%   7.58%
Cys   C       0       0.00%   1.66%
Asp   D       9       9.18%   5.28%
Glu   E       9       9.18%   6.37%
Phe   F       5       5.10%   4.09%
Gly   G       4       4.08%   6.84%
His   H       3       3.06%   2.24%
Ile   I       12      12.24%  5.81%
Lys   K       9       9.18%   5.95%
Leu   L       6       6.12%   9.42%
Met   M       6       6.12%   2.37%
Asn   N       9       9.18%   4.45%
Pro   P       1       1.02%   4.9%
Gln   Q       3       3.06%   3.97%
Arg   R       3       3.06%   5.16%
Ser   S       2       2.04%   7.12%
Thr   T       2       2.04%   5.67%
Val   V       7       7.14%   6.58%
Trp   W       2       2.04%   1.23%
Tyr   Y       3       3.06%   3.18%



Number of acidic (negative) amino acids (ED): 18 18.37%

Number of basic (positive) amino acids (KR): 12 12.24%

Total charge (KRED): 30 30.61%

Net charge (KR - ED): -6 6.12%

Theoretical pI: 4.76

Total linear charge density: 0.33

Average hydrophobicity: -3.89

Ratio of hydrophilicity to hydrophobicity: 1.28
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Percentage of hydrophilic amino acid: 51.02%

Percentage of hydrophobic amino acid: 48.98% 

Ratio of %hydrophilic to %hydrophobic: 1.04 
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Table 5 



BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100017|lan|77ORF017 Phage 77 ORF|23269-23982|-3
(237 letters)

Database: nr

393,678 sequences; 120,452,765 total letters

Score E

Sequences producing significant alignments: (bits) Value

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ... 41 0.010
gi|730607|sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR P... 38 0.053
gi|3097044|emb|CAA75299| (Y15035) KIR [Cowpox virus] 38 0.090
gi|2146245|pir||S73794 hypothetical protein H91_orf180 - Mycopl... 38 0.090
gi|83910|pir||S04682 ribosomal protein var1 - yeast (Candida gl... 37 0.15
gi|133135|sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN ... 37 0.15
gi|2128843|pir||H64475 hypothetical protein MJ1409 - Methanococ... 36 0.20
gi|5107017|gb|AAD39926.1|AF126285_2 (AF126285) RNA polymerase (... 36 0.35
gi|2146210|pir||S73342 hypothetical protein E07_orf166 - Mycopl... 35 0.60

Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P23250 RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN. 38 0.014
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1. 37 0.040
sp|Q21444 LDLC_CAEEL LDLC PROTEIN HOMOLOG. 34 0.35
sp|P27240 RFAY_ECOLI LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT... 33 0.46
sp|P53192 YGC0_YEAST HYPOTHETICAL 27.1 KD PROTEIN IN ALK1-CKB1... 33 0.60
sp|P32908 SMC1_YEAST CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B... 33 0.60
sp|P54683 TAGB_DICDI PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR ... 32 0.78
sp|Q03100 CYAA_DICDI ADENYLATE CYCLASE, AGGREGATION SPECIFIC (... 32 0.78
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BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100019|lan|77ORF019 Phage 77 ORF|39851-40501|2
(216 letters) 

Database: nr 

373,355 sequences; 114,214,446 total letters 



Sequences producing significant alignments: 



Score E 
(bits) Value 



gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL] 437 e-122

gi|2689911 (AE000792) B. burgdorferi predicted coding region BB... 38 0.058

gi|1171589|emb|CAA64574| (X95275) frameshift [Plasmodium falcip... 37 0.10

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ... 36 0.23

gi|141257|sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (OR... 36 0.29

gi|133412|sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA... 35 0.51

gi|3122231|sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (HDH) ... 35 0.51

gi|3649757|emb|CAB11106.1| (Z98547) predicted using hexExon; MA... 34 0.66

gi|2688313 (AE001146) sensory transduction histidine kinase, pu... 34 0.87

Database: swissprot

79,449 sequences; 28,874,452 total letters 



Score E

Sequences producing significant alignments: (bits) Value

sp|P18019 YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (ORF9). 36 0.079
sp|Q58851 HISX_METJA HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H... 35 0.14
sp|P27059 RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA CHAIN (E... 35 0.14
sp|Q02224 CENE_HUMAN CENTROMERIC PROTEIN E (CENP-E PROTEIN). 34 0.31
sp|P04931 ARP_PLAFA ASPARAGINE-RICH PROTEIN (AG319) (ARP) (FRA... 33 0.53
sp|P18011 IPAB_SHIFL 62 KD MEMBRANE ANTIGEN. 32 0.69
sp|P18709 VTA2_XENLA VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA... 32 0.90
sp|Q64409 CP3H_CAVPO CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI... 32 0.90
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1. 32 0.90
sp|Q03945 IPAB_SHIDY 62 KD MEMBRANE ANTIGEN. 32 1.2
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BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3
(86 letters) 

Database: nr 

373,355 sequences; 114,214,446 total letters 



Score E

Sequences producing significant alignments: (bits) Value

gi|3341947|dbj|BAA31913| (AB009866) orf 39 [bacteriophage phi PVL] 182 6e-46
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo... 32 0.84
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN... 32 0.84
gi|1169735|sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTE... 32 0.84
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa... 32 0.84
gi|3875402|emb|CAA98122| (Z73906) cDNA EST EMBL:D64544 comes fr... 31 2.5
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa... 30 4.2



Database: swissprot

79,449 sequences; 28,874,452 total letters

Score E

Sequences producing significant alignments: (bits) Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) ... 32 0.24
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R... 32 0.24
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C... 28 3.5
sp|Q24118 LIO_DROME LINOTTE PROTEIN. 28 3.5
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II). 28 3.5
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT). 28 3.5
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA. 28 3.5
sp|P38255 YBU5_YEAST HYPOTHETICAL 51.3 KD PROTEIN IN PHO5-VPS1... 27 6.0
sp|P55822 SH3B_HUMAN SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO... 27 7.9
sp|Q58482 YA82_METJA HYPOTHETICAL PROTEIN MJ1082. 27 7.9
sp|P34252 YKK8_YEAST HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AAT1... 27 7.9
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BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2
(53 letters) 

Database: nr 

373,355 sequences; 114,214,446 total letters 

Score E 

Sequences producing significant alignments: (bits) Value 

gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL] 96 3e-20

gi|4325288|gb|AAD17315| (AF123593) voltage-dependent sodium cha... 28 7.1

gi|2649684 (AE001040) A. fulgidus predicted coding region AF092... 28 9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters 

Score E 

Sequences producing significant alignments: (bits) Value 

sp|P42087 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE. 26 7.1

sp|P04775 CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU... 26 9.2

sp|P42619 YQJF_ECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC... 26 9.2
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BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100104|lan|77ORF104 Phage 77 ORF|34393-34551|1
(52 letters) 



Database: nr 



373,355 sequences; 114,214,446 total letters 



Score E

Sequences producing significant alignments: (bits) Value

gi|2315523 (AF016452) similar to the leucine-rich domains found... 29 4.2
gi|4377168|gb|AAD18990| (AE001666) CT711 hypothetical protein [... 29 5.4
gi|3882171|dbj|BAA34445| (AB018268) KIAA0725 protein [Homo sapi... 28 9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters 



Sequences producing significant alignments: 



Score E 
(bits) Value 



sp|P04879 RRPP_VSVIG RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48... 27 5.4

sp|P04880 RRPP_VSVIM RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48... 27 5.4

sp|Q13946 CN7A_HUMAN HIGH-AFFINITY CAMP-SPECIFIC 3',5'-CYCLIC ... 26 7.1

sp|P35381 ATPA_DROME ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P... 26 9.3

sp|P54659 MVPB_DICDI MAJOR VAULT PROTEIN BETA (MVP-BETA). 26 9.3

sp|P40397 YHXC_BACSU HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK ... 26 9.3
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BLASTP 2.0.8 [Jan-05-1999]

Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3
(98 letters)

Database: nr

393,678 sequences; 120,452,765 total letters 

Score E 

Sequences producing significant alignments: (bits) Value 

gi|3341947|dbj|BAA31913.1| (AB009866) orf 39 [bacteriophage phi ... 182 8e-46

gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa... 35 0.13

gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN... 32 1.1

gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo... 32 1.1

gi|5051381|emb|CAB44736.1| (AL049653) dJ647M16.2 (FK506 binding ... 32 1.1

gi|4826730|ref|NP_004949.1|pFRAP1| FK506 binding protein 12-rap... 32 1.1

gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa... 32 1.1

Database: swissprot

79,909 sequences; 29,054,478 total letters 

Score E 

Sequences producing significant alignments: (bits) Value 

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) ... 32 0.29

sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R... 32 0.29

sp|P40557 YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC... 29 3.3

sp|Q24118 LIO_DROME LINOTTE PROTEIN. 28 4.4

sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA. 28 4.4

sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II). 28 4.4

sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C... 28 4.4

sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT). 28 4.4
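
Each hit line in the reports of Table 5 ends with a bit score followed by an E value, so the lines can be screened mechanically. The following is a minimal Python sketch of such a screen. The function names `parse_hit` and `significant` and the cutoff `max_e=1e-5` are assumptions made for illustration only; no cutoff is stated in the reports themselves.

```python
def parse_hit(line):
    """Split a one-line BLAST hit summary into (description, bit score, E value).

    The last two whitespace-separated fields are taken as score and E value.
    BLAST prints very small E values without a mantissa (e.g. "e-122"),
    so a leading "1" is supplied before conversion.
    """
    description, score, e_value = line.rsplit(None, 2)
    if e_value.startswith("e"):
        e_value = "1" + e_value
    return description, int(score), float(e_value)


def significant(lines, max_e=1e-5):
    """Return parsed hits whose E value is at or below max_e (assumed cutoff)."""
    hits = [parse_hit(line) for line in lines if line.strip()]
    return [hit for hit in hits if hit[2] <= max_e]


# Two hit lines from the 77ORF019 search against nr, as listed in Table 5.
EXAMPLE_LINES = [
    "gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL] 437 e-122",
    "gi|2689911 (AE000792) B. burgdorferi predicted coding region BB... 38 0.058",
]

if __name__ == "__main__":
    for description, score, e_value in significant(EXAMPLE_LINES):
        print(score, e_value, description)
```

With the assumed cutoff, only the bacteriophage phi PVL orf 59 line is retained from the two example lines.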
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Table 6 



1st position              2nd position              3rd position
(5' end)       U       C       A       G            (3' end)

U              Phe     Ser     Tyr     Cys          U
               Phe     Ser     Tyr     Cys          C
               Leu     Ser     Stop    Stop         A
               Leu     Ser     Stop    Trp          G

C              Leu     Pro     His     Arg          U
               Leu     Pro     His     Arg          C
               Leu     Pro     Gln     Arg          A
               Leu     Pro     Gln     Arg          G

A              Ile     Thr     Asn     Ser          U
               Ile     Thr     Asn     Ser          C
               Ile     Thr     Lys     Arg          A
               Met     Thr     Lys     Arg          G

G              Val     Ala     Asp     Gly          U
               Val     Ala     Asp     Gly          C
               Val     Ala     Glu     Gly          A
               Val     Ala     Glu     Gly          G
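
The genetic code of Table 6 is what relates the nucleotide lines and amino acid lines in the ORF listings given earlier. As an illustration, the following minimal Python sketch builds the same code as a lookup table (written with T in place of U, since the listings are DNA) and translates the 77ORF104 nucleotide sequence. The names `CODON_TABLE`, `translate` and `ORF104_DNA` are hypothetical and chosen for this example only.

```python
# Standard genetic code, as in Table 6, with bases ordered T, C, A, G at each
# codon position; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# 77ORF104 nucleotide sequence (phage 77 positions 34393 to 34551), copied
# from the listing above.
ORF104_DNA = (
    "atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat"
    "atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat"
    "aggttgtacgacctatttatccaaaaacttgcagaacgtcataca"
    "cgccccgctatcgtcgaatattaa"
)


def translate(dna):
    """Translate a DNA (or RNA) reading frame, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(ORF104_DNA), "nucleotides")
    print(translate(ORF104_DNA))
```

The 159 nucleotides correspond to 52 codons plus a stop codon, and the translation reproduces the 52-residue 77ORF104 sequence given in its physico-chemical parameter listing.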
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Table 7 

Bacteriophage 3A, complete genome sequence 

1 caaacgctag caacgcggat aaatttttca tgaaaggggg tctttatatg aagttaacaa aaaaacagct 

71 aaaagaatat atagaagatt acaaaaaatc tgatgacata ttaattaatt tgtatataga aacatatgaa 

141 ttttattgtc ggttaagaga tgaacttaaa aatagtgatt taatgataga gcatacaaac aaggctggtg 
211 cgagcaatat tattaagaat ccattaagca tagaactgac aaaaacagct caaacactaa ataacttact 

281 caagtctatg ggtttaactg cagcacaaag aaaaaagata gttcaagaag aaggtggatt cggtgactat 

351 taaagtttta aatgaacctt caccaaaact attaacaaca tggtatgcag agcaagtcac tcaagggaaa 

421 ataaaaacaa gcaaatatgt tagaaaagaa tgtgagagac atcttagata tctagaaaat ggaggtaaat 

491 gggtatttga tgaagaatta gcgcatcgtc ctattcgatt tacagaaaag ttttgtaaac cttccaaagg 

561 atctaaacgt caacttgtat tacagccatg gcaacatttt attatcggca gtttgtttgg ttgggttcat 

631 aaagaaacaa aactgcgcag gtttaaagaa gctttgatat ttatggggcg aaaaaatggt aaaacaacca 

701 ctatttctgg ggttgctaac tatgctgtat cacaagatgg agaaaatggt gcagaaattc atttgttagc 

771 aaacgtaatg aaacaagcta ggattctatt tgatgaatct aaggcgatga ttaaagctag cccaaagctt 

841 gataaaaatt tcagaacatt aagagatgaa atccattatg acgcaacgat atcaaaaatt atgccccaag 

911 catcagatag cgataagtta gatggattga atacacacat ggggattttt gatgaaattc atgaatttaa 

981 agactataaa ttgatttcag ttataaaaaa ctcaagagct gcaaggttac aacctcttct catctacatt 

1051 acgacagcag ggtatcaatt agatggtcca cttgttgata tggtagaagc gggaagagac accttagatc

1121 aaatcataga agacgaaaga actttttatt atttagcatc tttggatgat gacgatgata ttaatgattc

1191 gtcgaactgg ataaaagcaa atcccaactt aggtgtctct ataaatttag atgagatgaa agaagagtgg 

1261 gaaaaagcta agagaacacc agctgaacgt ggagatttta taaccaaaag gtttaatatc tttgctaata

1331 atgacgagat gagttttatt gattacccaa cactccaaaa aaataatgaa attgtttctt tagaagagct 

1401 ggaaggcaga ccgtgcacga ttggttatga tttatcagaa acagaggact ttacagccgc gtgtgctact 

1471 tttgcgttag ataatggtaa agttgcagtt ttatcgcatt catggattcc taagcacaaa gttgaatatt

1541 ctaacgaaaa aataccctat agagaatggg aagaagatgg cttattaaca gtgcaagata agccttatat

1611 tgactaccaa gatgttttaa attggataat taagatgaat gagcattatg tagtagaaaa aattacttat 

1681 gatagagcga acgcattcaa actaaatcaa gagttaaaaa attacgggtt tgaaacggaa gaaacaagac

1751 aaggagcttt gaccttgagc cctgcattga aggatttaaa agaaatgttt ttagatggga aaataatatt

1821 taataataat cctttaatga aatggtatat caataatgtt cagttgaaac tagacagaaa cggaaactgg

1891 ttgccgtcta agcaaagcag atatcgtaaa atagatggct ttgcagcatt tttaaacaca tatacagata

1961 ttatgaataa agttgtttct gatagtggtg aaggaaacat agagtttatt agtattaaag acataatgcg

2031 ttaaggaggt gaatgttatc gcaaaagaga atattgtcac acgcataaag aaaaaattga tagacaattg

2101 gattgatcag tcaacttcta agctttatga ctttagccca tggaaaaata gatctttttg gggtgtaatt 

2171 aataatacgc ttgaaactaa tgaaacgata ttttcagcta ttacaaagtt atctaattcg atggctagtt

2241 tgcccttgaa aatgtatgaa gattataaag tagttaatac agaagtatct gatttactta cagtgtcacc 

2311 gaataattct ctgagcagtt ttgattttat taatcaaatt gaaacaatca gaaatgaaaa aggtaatgca

2381 tatgtgctaa ttgaacgaga catctatcat caaccatcaa agcttttctt attaaatcca gatgttgttg

2451 aaatgttaat tgaaaaccaa tcacgtgaac tttattattc cattcatgct gcaactggaa ataaattgat

2521 tgttcataat atggacatgt tgcattttaa acacatcgtg gcatctaata tggtgcaagg cattagtccg

2591 attgatgtgt tgaagaatac aactgatttt gataatgcag taagaacctt taatcttaca gaaatgcaaa

2661 aacctgattc tttcatgctt aaatatggtt ccaatgtagg taaagaaaaa aggcagcaag tgttagaaga

2731 tttcaaacag tactatgaag aaaacggtgg aatattattc caagagcctg gtgttgaaat cgaaccgtta

2801 cctaaaaaat atgtctctga agatatagtg gcaagcgaga atttaacaag agaaagagta gctaacgttt

2871 ttcaattgcc ctcagtattc ttaaatgcaa gatcaaatac aaatttcgcg aaaaatgaag agttaaacag

2941 attttacttg cagcatacct tattgccaat cgtcaaacag tatgaagaag aatttaatcg gaaactactt

3011 actaaaacag acagagaaaa aaataggtat tttaaattta acgttaaacc ttatttaagg gctgatagtg

3081 caacacaagc agaagtgtac tttaaagcag ttcgtagtgg ttactacact ataaatgaca ttagagagtg

3151 ggaagattta ccaccagttg aaggtggaga taagccgcta ataagcggtg atttataccc aattgacacg

3221 ccacttgaat taagaaaatc tttgaaaggt ggtgataaaa atgtcaatga aagctaagta ttttcaaatg

3291 aaaagaaaat caaaaagtaa aggtgaaata tttatttatg gtgatattgt aagtgataaa tggtttgaaa 

3361 gtgatgtaac tgctacagat ttcaaaaaca aactagatga actaggagac atcagtgaaa tagatgttca 

3431 tataaattca tctggaggca gtgtatttga agggcatgca atatacaata tgctaaaaat gcatcctgca

3501 aaaattaata tctatgtcga tgccttagcg gcatcaattg ctagtgttat cgctatgagt ggtgacacta 

3571 tttttatgca caaaaatagt tttttaatga ttcataattc atgggttatg actgtaggta atgcagaaga

3641 gttaagaaag acagcggatt tacttgaaaa aacagatgct gttagtaatt cagcttattt agataaagca

3711 aaagatttag atcaagaaca cttaaaacag atgttagatg cagaaacttg gcttactgca gaagaagcct

3781 tgtctttcgg cttgatagat gaaattttag gagctaatga aataactgct agtatctcta aagagcaata

3851 taagcgtttc gagaacgtcc cagaagattt aaagaaagat gtagacaaaa tcactaaaat cgatgatgta

3921 gatacgtttg aattggttga aacacctaaa gaaagtatgt cactagaaga aaaagaaaaa agagaaaaaa

3991 ttaaacgcga atgcgaaatt ttaaaaatga caatgagtta ttaggaggaa atgaaatgcc gacattatat

4061 gaattaaaac aatccttagg tatgattgga caacaattaa aaaataaaaa tgatgaattg agtcagaaag 

4131 caacagaccc aaatattgat atggaagaca tcaaacaact agaaacagaa aaagcaggct tacaacaaag 

4201 atttaacatt gttgaaagac aagtaaaaga cattgaagaa aaagaaaaag cgaaagttaa agacacagga 

4271 gaagcttatc aatctttaaa tgatcatgag aagatggtta aagctaaggc agagttttat cgtcacgcga

4341 ttttaccaaa tgaatttgaa aaaccttcaa tggaggcaca acgtttatta cacgctttac caacaggtaa

4411 tgattcaggt ggtgataagc tcttaccaaa aacactttct aaagaaattg tttcagaacc atttgctaaa

4481 aaccaattac gtgaaaaagc tcgtctaacc aacattaaag gtttagagat tccaagagtt tcatatactt

4551 tagacgatga tgacttcatt acagatgtag aaacagcaaa agaattaaaa ttaaaaggtg atacagttaa

4621 attcaccact aataaattca aagtatttgc tgcaatttca gatactgtaa ttcatggatc agatgtagat

4691 tcagtaaact gggttgaaaa cgcactacaa tcaggtctag cagctaaaga acgtaaagat gccttagcag
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4761 taagtcctaa atctggatta gatcacatgt catcctacaa tggatctgtt aaagaagttg agggagcaga 

4831 catgtatgat gctattatta acgctttagc agatttacat gaagattacc gtgataacgc aacaatttat 

4901 atgcgatatg cggattatgt caaaattatt agtgttcttt caaatggaac aacaaatttc tttgacacac 

4971 cagcagaaaa agtatttggc aaaccagtag tatttacaga tgcagcagtt aaacctattg tgggagattt 

5041 caattatttt ggaattaact atgatggaac aacttatgac actgataaag atgttaaaaa aggcgaatat 

5111 ttgtttgtat taactgcatg gtatgatcag caacgtacat tagacagtgc attcagaatt gcaaaagcaa 

5181 aagaaaatac aggttcatta cccagctaag ccccaaaagg ttaatgtaac agctaaggct aaatcagctg 

5251 taatatcagc cgaatagggg tgatgaaatg agtttagaag aaattaaatt gtggttgaga attgactata 

5321 atttcgaaaa tgatttaatt gaaggtctca ttcaatcggc taagtctgaa ttactattaa gtggggttcc 

5391 agattatgac aaagatgact tggaataccc gcttttttgt acagcgatta gatatatcat tgcaagagat 

5461 tatgaaagtc gtgggtactc aaatgaccaa tctagaagca aggtttttaa tgaaaaggga ttgcaaaaaa 

5531 tgattctgaa attaaaaaag tggtaggtga tttttaaatg gaatttaatg aatttaaaga tcgcgcatat 

5601 ttttttcaat atgtaaataa agggccgtat ccagatgaag aggaaaaaat gaagttgtat agttgctttt 

5671 gtaaaatata taatccttct atgaaagata gagaaatttt aaaagcgact gaatcaaagt caggactaac 

5741 cataattatg aggtcttcta aaattgaata tctaccacaa acaaatcact tagttaaaat tgacagaggc 

5811 ttatattccg ataaattatt caacattaaa gaaataagaa ttgatacacc agatattggc tataatacag 

5881 tggttttatc agaaaaatga gtgtagaaat taaagggata cctgaagtgt tgaagaaatt agaatcggta 

5951 tacggtaaac aatcaatgca agctaagagt gatagagctt taaatgaagc atctgaattt tttataaagg 

6021 ctttaaagaa agaattcgag agttttaaag atacgggtgc tagcatagaa gaaatgacta aatctaagcc

6091 ttatacaaaa gtaggaagtc aagaaagagc tgttttaatt gaatgggtag gccctatgaa tcgcaaaaac 

6161 attattcact tgaatgaaca tggttataca agagatggaa aaaaatatac accaagaggt tttggagtta 

6231 ttgcaaaaac attagctgct aatgaacgga agtatagaga aattataaaa aaggagttgg ccagataaat 

6301 gaatatatta aacaccataa aagaaatttt attatctgat gcagagctcc aaacatatat aaattctaga 

6371 atatactatt ataaagtcac tgaaaatgct gaaacttcca aaccttttgt tgttattaca cctatttatg 

6441 atttaccttc agacttcatg tctgataaat atcttagtga agaatactta attcaaatag atgtagaatc 

6511 ttcaaataat cagaaaacaa ttgatataac aaaacgaata agatatctgt tatatcaaca aaatttaatt 

6581 caagcatcta gtcagttaga tgcttatttt gaagaaacta aacgttatgt gatgtcgaga cgctatcaag 

6651 gcataccaaa aaatatatat tataaaaatc agcgcatcga ataggtgtgc tttttaattt ttaaggagga 

6721 aataagcaat ggcagaagga caaggttctt ataaagtagg ttttaaaaga ttatacgttg gagtttttaa 

6791 cccagaagca acaaaagtag ttaaacgcat gacatgggaa gatgaaaaag gtggtacagt tgatctaaat 

6861 atcacaggtt tagcaccaga tttagtagat atgtttgcat ctaacaaacg tgcttggatg aaaaaacaag 

6931 gtactaatga agttaagtct gacatgagta tttttaatat tccaagtgaa gatctaaata cagttattgg 

7001 tcgttctaaa gataaaaatg gtacatcttg ggtaggagag aatacaagag caccatacgt aacagttatt 

7071 ggagaatctg aagatggttt aacaggtcaa ccagtgtacg ttgcgctact taaaggtact tttagcttgg 

7141 attcaattga atttaaaaca cgaggagaaa aagcagaagc accagagcca acaaaattaa ctggtgactg 

7211 gatgaacaga aaagttgatg ttgatggtac tccacaaggt attgtatacg ggtatcatga aggtaaagaa 

7281 ggagaagcag aattcttcaa aaaagtattc gttggataca cggacagtga agatcattca gaggattctg 

7351 caagttcgtt acccagctaa cccccaaaat gttgaagtag cagttaattc aaaatctgca acagtttcag 

7421 cagaataggg gctttcaaaa taaatcaaag gagaataatt tatgactaaa actttaaagg tttataaagg 

7491 agacgacgtc gtagcttctg aacaaggtga aggcaaagtg tcagtaactt tatctaattt agaagcggat 

7561 acaacttatc caaaaggtac ttaccaagtg gcatgggaag aaaatggtaa agaatctagt aaagttgatg 

7631 tacctcaatt caaaaccaat ccaattctag tctcaggcgt atcatttaca cccgaaacta aatcaatcac 

7701 ggtaaatgct gatgacaatg ttgaaccaaa cattgcacca agtacagcaa cgaataaaac gttgaaatat 

7771 acaagtgaac atccagagtt tgttactgtt gatgagagaa caggagcaat tcacggtgta gctgagggaa 

7841 cttcagttat cactgctacg tctactgacg gaagtgacaa gtctggacaa attacagtaa cagtaacaaa 

7911 tggataatta tttgagacgc agaatatctg cgtctttttt atttgaataa aaggagctaa tacaatgatt 

7981 aaatttgaaa ttaaagaccg taaaacagga aaaacagaga gctatacaaa agaagatgtg acaatgggcg 

8051 aagcagaaaa atgctatgag tatttagaat tagtaaatca agagaataaa aaagaagtac ctaacgcaac 

8121 aaaaatgaga caaaaagagc gacagttatt agtagattta tttaaagatg aaggattgac tgaagaagat 

8191 gttttgaaca agatgagcac taaaacttat acaaaagcct tgaaagatat atttcgagaa atcaatggtg

8261 aagatgaaga agattcagaa actgaaccag aagagatggg aaagacagaa gaacaatctc aataaaagat 

8331 attttatcga acattaagaa aatacaacgt ttctgtatgg agcagtatgg gtggacatta actgaagtca 

8401 gaaaacagcc gtatgtaaaa cttttagaaa tacttaatga agagaataaa gaagagactg aagaaaaaca 

8471 aagtgaacaa aaagtcatta caggtacgga tttaagaaaa ctttttggaa gctagaaagg aggttaatat 

8541 gaatgaaaaa gtagaaggca tgaccttgga gctgaaatta gaccatttag gtgtccaaga aggcatgaag 

8611 ggtttaaagc gacaattagg tgttgttaat agtgaaatga aagctaatct gtcatcattt gataagtctg 

8681 aaaaatcaat ggaaaagtat caggcgagaa ttaaggggtt aaatgataag cttaaagttc aaaaaaagat 

8751 gtattctcaa gtagaagatg agcttaaaca agttaacgct aattatcaaa aagctaaatc tagtgtaaaa 

8821 gatgttgaga aagcatattt aaagctagta gaagctaata aaaaagaaaa attagctctt gataaatcta 

8891 aagaagcctt aaaatcttcg aatacagaac ttaaaaaagc cgaaaatcaa tataaacgta caaatcaacg 

8961 taaacaagat gcatatcaaa aacttaaaca gttgagagat gcagaacaaa agcttaagaa tagtaaccaa 

9031 gctactactg cacaactaaa aagagcaagt gacgcagtac agaagcagtc cgctaagcat aaagcacttg 

9101 ttgaacaata taaacaagaa ggcaatcaag ttcaaaaact aaaagtacaa aatgataatc tttcaaaatc 

9171 aaacgaaaaa atagaaaatt cttacgctaa aactaatact aaattaaagc aaacagaaaa agaatttaat 

9241 gatttaaata atactattaa gaatcatagc gctaatgtcg caaaagctga aacagctgtt aacaaagaaa 

9311 aagctgcttt aaataattta gagcgttcaa tagataaagc ttcatccgaa atgaagactt ttaacaaaga 

9381 acaaatgata gctcaaagtc atttcggcaa acttgctagt caagcggatg tcatgtcaaa gaaatttagt 

9451 tctattggag ataaaatgac ttccctagga cgtacgatga cgatgggcgt atctacaccg attactttag 

9521 ggttaggtgc agcattaaaa acaagtgcag acttcgaagg gcaaatgtct cgagttggag cgattgcaca 

9591 agcaagcagt aaagacttaa aaagcatgtc taatcaagcg gttgacttag gcgctaaaac aagtaaaagt 

9661 gctaacgaag ttgctaaagg tatggaagaa ttggcagctt taggctttaa tgccaaacaa acaatggagg 

9731 ctatgccggg tgttatcagt gcagcagaag caagcggtgc agaaatggct acaactgcaa ctgtaatggc 

9801 atcagcaatt aattctttcg gtttaaaagc atctgatgca aaccatgttg ctgatttact tgcgagatca 

9871 gctaatgata gtgctgcaga tattcaatac atgggagatg cattaaaata tgcaggtact ccagcaaaag 

9941 cattaggagt ttcaatagag gacacttctg cagcaattga agttttatct aactcagggt tagaggggtc 

10011 tcaagcaggt actgcattaa gagcttcgtt tattaggcta gctaatccaa gtaaaagtac agctaaggaa 
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10081 atgaaaaaat taggtattca tttgtctgat gctaaaggtc aatttgttgg catgggtgaa ttgattagac 

10151 agttccaaga caacatgaaa ggcatgacga gagaacaaaa accagcaaca gtggctacaa tagttggcac 

10221 tgaagcagca agtggatttt tagccttgat tgaagcgggt ccagacaaaa ttaatagcta tagcaaatca 

10291 ttgaagaact ctaatggtga aagtaaaaaa gcagctgatt tgatgaaaga caacctcaaa ggtgctctgg 

10361 aacaattagg tggcgctttt gaatcgttag caattgaagt tggtaaagat ttaacgccta tgattagagc 

10431 aggtgcggaa ggattaacaa aattagttga tggatttaca catcttcctg gttggtttag aaaggcttcg 

10501 gtaggtttag cgatttttgg tgcatctatt ggccctgctg ttcttgctgg tggcttatta atacgtgcag 

10571 ttggaagcgc ggctaaaggc tatgcatcat taaatagacg cattgctgaa aatacaatac tgtctaatac 

10641 caattcaaaa gcaatgaaat ctttaggtct tcaaacctta tttcttggtt ctacaacagg aaaaacgtca 

10711 aaaggcttta aaggattagc cggagctatg ttgtttaatt caaaacctat aaatgttttg aaaaattctg 

10781 caaagctagc aattttaccg ttcaaacttt tgaaaaacgg tttaggatta gccgcaaaat ccttatttgc 

10851 agtaagtgga ggcgcaagat ttgctggtgt agccttaaag tttttaacag gacctatagg tgctacaata 

10921 actgctatta caattgcata taaagttttt aaaaccgcat atgatcgtgt ggaatggttc agaaacggta 

10991 ttaacggttt aggagaaact ataaagtttt ttggtggcaa aattattggc ggtgctgtta ggaagctagg 

11061 agagtttaaa aattatcttg gaagtatagg caaaagcttc aaagaaaagt tttcaaagga tatgaaagat 

11131 ggttataaat ctttgagtga cgatgacctt ctgaaagtag gagtcaacaa gtttaaagga tttatgcaaa 

11201 ccatgggcac agcttctaaa aaagcatctg atactgtaaa agtgttgggg aaaggtgttt caaaagaaac 

11271 agaaaaagct ttagaaaaat acgtacacta ttctgaagag aacaacagaa tcatggaaaa agtacgttta 

11341 aactcgggtc aaataacaga agacaaagca aaaaaacttt tgaaaattga agcggattta tctaataacc 

11411 ttatagctga aatagaaaaa agaaataaaa aggaactcga aaaaactcaa gaacttattg ataagtatag 

11481 tgcgttcgat gaacaagaaa agcaaaacat tttaactaga actaaagaaa aaaatgactt gcgaattaaa 

11551 aaagagcaag aactcaatca gaaaatcaaa gaattgaaag aaaaagcttt aagtgatggt cagatttcag

11621 aaaatgaaag aaaagaaatt gaaaagcttg aaaatcaaag acgtgacatc actgttaaag aattgagtaa 

11691 gactgaaaaa gagcaagagc gtattttagt aagaatgcaa agaaacagaa atgcttattc aatagacgaa 

11761 gcgagcaaag caattaaaga agcagaaaaa gcaagaaaag caagaaaaaa agaagtggac aagcaatatg 

11831 aagatgatgt cattgctata aaaaataacg tcaacctttc taagtctgaa aaagataaat tattagctat

11901 tgctgatcaa agacataagg atgaagtaag aaaggcaaaa tctaaaaaag atgctgtagt agacgttgtt 

11971 aaaaagcaaa ataaagatat tgataaagag atggatttat ccagtggtcg tgtatataaa aatactgaaa 

12041 agtggtggaa tggccttaaa agttggtggt ctaacttcag agaagaccaa aagaagaaaa gtgataagta 

12111 cgctaaagaa caagaagaaa cagctcgtag aaacagagaa aatataaaga aatggtttgg aaatgcttgg

12181 gacggcgtaa aaactaaaac tggcgaagct tttagtaaaa tgggcagaaa tgctaatcat tttggcggcg

12251 aaatgaaaaa aatgtggagt ggaatcaaag gaattccaag caaattaagt tcaggttgga gctcagccaa 

12321 aagttctgta ggatatcaca ctaaggctat agctaatagt actggtaaat ggtttggaaa agcttggcaa 

12391 tctgttaaat cgactacagg aagtatttac aatcaaacta agcaaaagta ttcagatgcc tcagataaag

12461 cttgggcgca ttcaaaatct atttggaaag ggacatcaaa atggtttagc aatgcatata aaagtgcaaa

12531 gggctggcta acggatatgg ctaataaatc gcgctcgaaa tgggataata tttctagtac agcatggtcg

12601 aatgcaaaat ccgtttggaa aggaacatcg aaatggttta gtaactcata caaatcttta aaaggttgga

12671 ctggagatat gtattcaaga gcccacgatc gttttgatgc aatttcaagt tcggcatggt ctaacgctaa

12741 atcagtattt aatggtttta gaaaatggct atcaagaaca tatgaatgga ttagagatat tggtaaagac 

12811 atgggaagag ctgcggctga tttaggtaaa aatgttgcta ataaagctat tggcggttta aatagcatga 

12881 ttggcggtat taataaaata tctaaagcca ttactgataa aaatctcatc aagccaatac ctacattgtc

12951 tactggtact ttagcaggaa agggtgtagc taccgataat tcgggagcat taacgcaacc gacatttgct

13021 gtattaaatg atagaggttc tggaaacgcc ccaggtggtg gagttcaaga agtaattcac agggctgacg

13091 gaacattcca tgcaccccaa ggacgagatg tggttgttcc actaggagtt ggagatagtg taataaatgc 

13161 caatgacact ctgaagttac agcggatggg tgttttgcca aaattccatg gtggtacgaa aaagaaagat 

13231 tggctagacc aacttaaagg taatataggt aaaaaagcag gagaatttgg agctacagct aaaaacacag 

13301 cgcataatat caaaaaaggt gcagaagaaa tggrtgaagc agcaggcgat aaaatcaaag atggtgcatc

13371 ttggttaggc gataaaatcg gcgatgtgtg ggattacgta caacatccag ggaaactagt aaataaagta 

13441 atgtcaggtt taaatattaa ttttggaggc ggactaacgc tacagtaaaa attgctaaag gcgcgtactc

13511 attgctcaaa aagaaattaa tagacaaagt aaaatcgtgg tttgaagatt ttggtggtgg aggcgatgga

13581 agctatctat ttgaatatcc aatctggcaa agatttggac gctacacagg tggacttaac tttaatgacg 

13651 gtcgtcacta tggtatagac tttggtatgc ctactggaac aaacgtttat gccgttaaag gtggtatagc

13721 agataaggta tggactgatt acggtggcgg taattctata caaattaaga ccggtgctaa cgaatggaac 

13791 tggtatatgc atttatctaa gcaattagca agacaaggcc aacgtattaa agctggtcaa ctgataggga

13861 aatcaggtgc tacaggtaat ttcgttagag gagcacactt acatttccaa ttgatgcaag ggtcacatcc 

13931 agggaatgat acagctaaag atccagaaaa atggttgaag tcacttaaag gtagtggcgt tcgaagtggt 

14001 tcaggtgtta ataaggctgc atctgcttgg gcaggcgata tacgtcgtgc agcaaaacga atgggtgtta

14071 atgttacttc gggtgatgta ggaaatatca ttagcttgat tcaacacgaa tcaggaggaa atgcaggtat

14141 aactcaatct agttcgctta gagacatcaa cgttttacag ggcaatccag caaaaggatt gcttcaatat 

14211 atcccacaaa catttagaca ttatgctgtt agaggtcaca acaatatata tagtggttac gatcagttat 

14281 tagcgttctt taacaacaga tattggcgct cacagtttaa cccaagaggt ggttggtctc caagtggtcc

14351 aagaagatat gcgaatggtg gtttgattac aaagcatcaa cttgctgaag tgggtgaagg agataaacag

14421 gagatggtta tccctttaac tagacgtaaa cgagcaattc aattaactga acaggttatg cgcatcatcg

14491 gtatggatgg caagccaaat aacatcactg taaataatga tacttctaca gttgaaaaat tgttgaaaca

14561 aattgttatg ttaagtgata aaggaaataa attaacagat gcattgattc aaactgtttc ttctcaggat 

14631 aataacttag gttctaatga tgcaattaga ggtttagaaa aaatattgtc aaaacaaagt gggcatagag 

14701 caaatgcaaa taattatatg ggaggtttga ctaattaatg caatcttttg taaaaatcat agatggttac

14771 aaggaagaag taataacaga ttttaatcag cttatatttt tagatgcaag ggctgaaagt ccaaacacca

14841 atgataacag tgtaactatt aacggagtag atggtatttt accgggcgca attagttttg cgcctttttc 

14911 attagtatta aggtttggct atgatggtat agatgttata gatttaaatt tatttgagca ttggtttaga

14981 tctgtgttta atcgcagaca tccttattat gttattactt ctcaaatgcc tggtgttaaa tatgcagtga

15051 atacagctaa tgttacatct aatttaaaag atggttcttc aactgaaatt gaagtaagtt taaatgttta

15121 taaagggtat tctgaatcag ttaattggac cgatagcgag ttcttattcg actctaattg gatgtttgaa

15191 aatggaattc ctcttgattt cacacctaaa tatactcata catcaaatca atttactatt tggaacggtt

15261 ctactgatac gataaatcca cgattcaagc acgatttgaa aatattaatt aatttaaatg cgagtggagg

15331 atttgaactg gttaactata caacaggtga tatttttaag tacaacaaaa gtatagataa aaacactgat 
15401 tttgttttag atggtgtgta tgcatatcga gatataaata gagtgggaat tgatacaaat agaggcatta

15471 taacattagc gccaggtaaa aatgaattta agattaaagg agacatcagt gatattaaaa ctacatttaa 

15541 gcctcctttt atttataggt aggtgattta atggattatc atgatcattt atcagtaatg gattttaatg

15611 aattgatttg tgaaaattta ctagatgtag attatggttc ttttaaagaa tattatgaac tgaatgaagc 

15681 taggtacatc acttttacag tttatagaac tactcataat agttttgttt tcgatttact aatttgtgaa 

15751 aacttcataa tttatcatgg tgaaaaatac acaatcaagc agacagcgcc aaaggttgaa ggtgataaag 

15821 tttttattga agttacggca tatcacataa cgtatgaatc tcaaaatcac tcagtggaat caaataagct 

15891 tgatgacgac agtagcgaaa ctggtaaaac gccagaatac tctttagatg agtacttaag atatggattt 

15961 gcaaatcaaa aaacttcggt caaaatgacc tataaaataa ttggaaattt taagcgaaaa gtaccgattg 

16031 acgaattagg taacaaaaac ggcttagaat actgtaaaga agcggtagac ctatttggct gtataattta 

16101 cccaaatgat acggagatat gtttttattc tcctgaaaca ttttatcaaa gaagcgagaa agtgattcga 

16171 tatcaatata atactgatac tgtatctgca actgtcagta cattggaatt aagaacagct ataaaagttt 

16241 ttggaaaaaa gtatacagct gaggaaaaga aaaatcataa tcctattaga acaactgaca ttaaatattc 

16311 aaatggtttt ataaaagaag gtacttatcg taccgcaaca attgggtcta aagctactat taactttgat 

16381 tgcaagtatg gtaatgaaac agttagattt acaataaaaa agggctctca aggtggaaca tataagttga 

16451 ttttagacgg caagcaaatt aagcaaattt cttgttttgc taagtcggtt cagtctgaaa caatagattt 

16521 aataaaaaat attgataaag gcaagcacgt tttagaaatg atatttttag gagaagaccc caaaaataga 

16591 attgatatat cttcaaataa aaaagctaag ccttgtatgt atgttggaac tgaaaaatca acagtcttaa 

16661 atttaattgc tgacaactca ggtcgcaatc aatacaaagc aattgttgac tacgtcgcag atagtgcaaa 

16731 gcagtttggg attcgatatg ctaatacgca aacaaatgaa gatatcgaaa cacaggataa gctgttagaa 

16801 tttgcaaaaa agcaaataaa tgatactcct aagactgaat tagatgttaa ttatataggt tatgaaaaaa 

16871 tagagccaag agatagcgta ttctttgttc atgaattaat gggatataac actgaattaa aggttgttaa 

16941 acttgatagg tcacatccat ttgtaaacgc aatagatgaa gtgtctttca gcaatgaaat aaaggatatg 

17011 gtacaaattc aacaagcgct taacagacga gttattgcac aagataatag atataactat caagcaaatc 

17081 gtataaatca tttatacact agtactttga attctccttt cgagacaatg gatataggga gtgtattaat 

17151 ataatggcaa cagaagaagt taaaatcaaa gcgctacttg aaaacgataa acagtacttt ccagctacac 

17221 attggaaagc tataaatggg ataccttatg caggcagtag tgatattgat ggattgcctc aagacggtat 

17291 catttcggta gatgataaaa ataaattaga taatttaaaa ataggcgaag caggaattat tcaaaatagc 

17361 attgtacaga aatccccaaa cggtaaattg tggaaaataa cagttgacga tagtgggaaa cttggtacag 

17431 tgctatttta ttagaaagga aggtgcatta tggaaaattt gtatttaata aaggatttgg gagctttagc 

17501 aggtcgagat tatagagcta aggaaataca aaacttacaa agaatagagc aatttgcgct tggcttgaca 

17571 acagagttta agttgcatca gaaagctaaa acaattcaac acttcgctga gcaaatttat tataatggta 

17641 gatcgcaagc agcagtaaac aaatctttac aaagtcaaat taacgcactt gttgtggcac cacgtaataa 

17711 cagtgctaat gagattgttc aagctcgagt taatgtaaac ggcgaaacct ttgacacatt aaaagaacat 

17781 ttagacgatt gggaaaccca aactcaaatt aataaagagg aaactataag agaattaaat aagaccaaac 

17851 aagaaattct tgatatcgag tatcgttttg aacctgataa gcaagaattt ttatttgtga cagaacttgc 

17921 acctcttaca aatgcagtaa tgcaatcctt ctggtttgat aatagaacag gcatagtata catgacacaa 

17991 gctagaaata atggctatat gctaagtcgt ctaagaccta atggtcaatt tatagacagc tcattgattg 

18061 taggtggggg tcatggtaca cataacggtt atagatatat tgatgatgag ttatggattt atagttttat 

18131 cctaaatggt aataatgaga atacattagt tcgtttcaag tatacgccta atgtggaaat tagctatggc 

18201 aagtatggta tgcaagatgt atttacagga cacccagaaa aaccctacat cacccctgtc ataaatgaaa 

18271 aagaaaataa aattctatac agaattgaga gacctagaag tcactgggaa cttgaaaact caatgaatta 

18341 tatagagata agaagtttag acgatgttga taaaaatatt gataaagttt tgcataaaat cagtatccct 

18411 atgagactaa caaacgaaac ccaaccaatg cagggtgtga cttttgatga aaaatacttg tattggtata

18481 caggagacag taatccaaat aatagaaact atttaacggc tttcgattta gaaacaggag aagaagcgta 

18551 tcaggttaat gctgactatg gtggaacact agattcattt cctggcgaat ttgcggaagc agaaggtttg 

18621 caaatatact atgacaaaga tagtggtaaa aaagctttga tgctaggtgt tactgtcggt ggtgatggaa 

18691 atagaacaca tcgtattttc atgattgggc aaagaggtat tttagaaata cttcactcaa gaggcgttcc 

18761 ttttatcatg agtgacacag gtggtagagt taaaccttta ccaatgaggc ctgataaact taagaatctt 

18831 gggatgttaa cagagccagg tctttactat ttatacactg atcatacagt tcaaatcgat gatttcccat 

18901 taccaagaga atggcgtgac gcaggttggt tcttggaagt taagccacca caaactggcg gtgatgtaat

18971 tcagatattg acgcgtaata gttatgcaag gaatatgatg acttttgaaa gggtgctttc tggaagaact 

19041 ggagacattt cggactggaa ttatgtgcct aaaaatagtg gtaaatggga gagagtacct tcattcatca 

19111 caaaaatgtc agatattaac atagtaggca tgtcgtttta tttaactacg gatgatacaa aacgttttac 

19181 agattttcca actgaacgta aaggggtagc tggttggaac ttatatgtag aagcttcaaa cacaggtggc 

19251 tttgttcata ggctagttcg taatagtgtt acagcatctg ctgagatact attgaaaaat tatgatagta 

19321 aaacaagttc agggccatgg actttacacg aagggagaat tataagttaa tgagtaattt agagaaatct 

19391 gtagctataa atttagaaaa cacagcgcat tatgaaaata tttcaaatct agatataact tttagaacag 

19461 gagagagtga ttcttctgtt cttcttttta atatcactaa aaataatcaa ccgttattat tgagtgaaga 

19531 aaatatcaaa gcacgaatag cgattcgagg taaaggagtc atggtagttg ctccactaga aatattagat 

19601 ccatttaaag gtattttaaa atttcaatta cctaatgatg taattaaacg agatggaagt tatcaagctc 

19671 aagtttcggt tgcagaatta ggtaattcag acgtggtagt tgtcgagaga actatcacat ttaacgttga 

19741 aaaaagtttg tttagcatga ttccatctga aacaaaatta cactatattg ttgaatttca ggaattagaa 

19811 aaaactatta tggatcgtgc gaaagcaatg gacgaggcta taaaaaatgg tgaagattat gcgagtctga 

19881 ttgaaaaagc taaagaaaaa ggtctatcag atattcaaat agcaaaatct tcaagtatag atgaattaaa 

19951 gcaacttgct aatagccata tatctgattt ggaaaataaa gcgcaagcat attcaagaac attcgatgag 

20021 caaaagcgat atatggatga gaaacatgaa gccttcaagc agtcagtgaa tagtggtggt ttagtcacaa 

20091 gtggttctac ttcaaattgg caaaaagcta agattactaa agatgatggt aagataatgc agattactgg 

20161 atttgatttt aataatccag aacaaagaat aggtgattca acccaattta tttatgtttc gcaagctata 

20231 aactatccaa gaggtgttag tactaacggt actgtcgaat atttagtagt aacttcagat tacaagcgta

20301 tgacttatcg accgaacggt acaaataaag tgtttgttaa aagaaaagaa gcgggttcat ggtctgagtg 

20371 gtcagaatta gctattaatg attacaacac accttttgaa actgttcaaa gtgcccaatc aaaagctaat 

20441 atggccgaaa gtaacgctaa attatacgca gatgacaagt ttaataaaag gtattcggtt atttttgatg 

20511 gaacagcaaa tggtgtgggc tctacattgt acttaaatga gagtttagac caatttattt tattaatttt 

20581 ctatgggact tttccaggtg gtgactttac agagtttggc agtccttttg gaggaggaaa gatttcactg 

20651 aatccctcaa atcttccaga tggtgatgga aatggtggag gtgtttatga gtttggatta actaaatcta 






20721 gtcgcacatc tttaactata ccaaacgatg tctatttcga cttaggaagt caaagaggct ctggtgcgaa

20791 cgcaaataga gggacaatta acaaaattat aggagtgaga aaataatgca aatattagtt aacaagcgta

20861 atgagataat tccatacgct atcattggtg gctttgaaga aggtattgat actgaaaatt taccagaaaa

20931 tttctctcaa gcttttagac ctaaagcctt taaatattca aatggggaaa tagtttttaa cgaagattat

21001 tcagaagaaa aagatgactt gcatcaacag attgacagtg aagaacaaaa cacagtcgct tctgatgaca

21071 tcttacgaaa aatggttgct agtatgcaga aacaagttgt tcaaagtaca aagttatcga tgcaagttaa

21141 taagcaaaat gcactaatgg caaaacaact tgtgacactt aataaaaaat tagaagaggt taaaggagag

21211 actgaaaatg crtaaattaa tcccaccaac attcgaagat attaaaacat ggtatcaatt gaaagaatat

21281 agtaaagaag acatagcgtg gtatgtagat atggaagtta tagataaaga ggaatatgca attattacag

21351 gagaaaagta tccagaaaat ctagagtcat aggttataat cttatggctt tttaatttga ataaagtggg

21421 tggtgtaatg cttggattta ccaaacgaca cgaacaagat tggcgtttaa cgcgattaga agaaaatgat

21491 aagactatgt ttgaaaaatt cgacagaata gaagacagtc tgagaacgca agaaaaaatt tatgacaagt

21561 tagatagaaa tctcgaagaa ctaaggcgtg acaaagaaga agatgaaaaa aataaagaga aaaatgctaa

21631 aaatattaga gacatcaaga tgtggattct aggattaata gggacgattc taagtacatt tgttatagcc

21701 ttgttaaaaa ctatttttgg catttaaagg aggtgattac catgcttaag ggaattttag gatatagctt

21771 ttggtcgtgt ttctggttta gtaagtgtaa gtaatagtta agagtcagtg cttcggcact ggctttttat

21841 tttggaaaaa aggagcaaac aaatggatgc aaaagtaata acaagataca tcgtattgat cttagcatta

21911 gtaaatcaat tctcagcgaa caaaggtatt agcccgattc cagtagacga tgagaatata tcatcaataa

21981 tacttactgt cgttgcttta tatactacgt ataaagacaa tccaacatct caagaaggta aatgggcaaa

22051 tcaaaagcta aagaaatata aagctgaaaa caagtataga aaagcaacag ggcaagcgcc aattaaagaa

22121 gtaatgacac ctacgaatat gaacgacaca aatgatttag ggtaggtgtt gaccaatgtt gataacaaaa

22191 aaccaagcag aaaaatggtt tgataattca ttagggaagc agttcaatcc tgatttgttt tatggatttc

22261 agtgttacga ttacgcaaat atgtttttta tgatagcaac aggcgaaagg ttacaaggtt tatacgctta

22331 taatattcca tttgataata aagcaaggat tgaaaaatac gggcaaataa ttaaaaacta tgatagcttt

22401 ttaccgcaaa agttggacat tgtcgttttc ccgtcaaagt atggtggcgg agctggacat gttgaaattg

22471 ttgagagcgc taatctaaac actttcacat cgtttggcca aaattggaat ggtaaaggtt ggacaaatgg

22541 cgttgcgcaa cctggttggg gtcccgaaac cgttacaaga catgttcatt attacgatga cccaatgtat

22611 tttattagat taaatttccc agataaagta agtgttggag ataaagctaa aagcgttatt aagcaagcaa

22681 ctgccaaaaa gcaagcagta attaaaccta aaaaaattat gcttgtagcc ggtcatggtt ataacgatcc

22751 tggagcagta ggaaacggaa caaacgaacg cgattttata cgtaaatata taacgccaaa tatcgctaag

22821 tatttaagac atgccggtca tgaagtcgca ttatatggtg gctcaagtca atcacaagac atgtatcaag

22891 atacagcata cggtgttaat gtaggtaata aaaaagatta tggcttatat tgggttaaat cacaggggta

22961 tgacattgtt ctagaaacac atttagacgc agcaggagaa agcgcaagtg gtgggcatgt tattatctca

23031 agtcaattca atgcagatac tattgataaa agtatacaag atgttattaa aaataactta ggacaaataa

23101 gaggtgtaac acctcgtaac gatttactaa atgttaacgt atcagcagaa ataaatataa attatcgctt

23171 atctgaatta ggttttatca ctaataaaaa tgatatggat tggattaaga aaaactatga cttgtattct

23241 aaattaatag ccggtgcgat tcatggtaag cctatcggtg gtgtgatatc tagtgaggtt aaaacaccag

23311 ctaaaaacga aaagaatccg ccagtgccag caggttatac acccgataaa aataatgtac cgtataaaaa

23381 agaaactggt tattacacag ttgccaatgt taaaggtaat aacgtaaggg acggctattc aactaattca

23451 agaattactg gtgtattacc taataacgca acaatcaaat atgacggcgc atattgtatc aatggctata

23521 gatggattac ttatattgct aatagtggac aacgtcgtta tattgctaca ggagaggtag acaaggcagg

23591 taatagaata agcagttttg gtaagtttag tgcagtttga taattgtata tgatgaatct taggcaggta

23661 cttcggtact tgcctattat ttaaaattaa taaacagtta atttttacat gaatatatta aattttaaaa

23731 aaacaaacgt ttttagtata taaattattt tgtgttcgta ttgtgtgcta tgattaaaaa gttgttatgg

23801 tcaactatat cgtggtttta tgtttattat caatcaaaat ataaattatt tataatttgt ttggtaatga

23871 acgggttttt ttcgaaataa tagtaaaaaa acacatttgt agatatttta aactcggtaa atcttttaat

23941 aaatatttaa ttttattaaa agttaaaaag gtttaatata aaaatgtaat aaaatttata aagaaaggaa

24011 atgattttta tggtcaaaaa aagactatta gctgcaacat tgtcgttagg aataatcact cctattgcta

24081 cttcgtttca tgaatctaaa gctgataaca atattgagaa tattggtgat ggcgctgagg tagtcaaaag

24151 aacagaagat acaagtagcg ataagtgggg ggtcacacaa aatattcagt ttgattttgt taaagataaa

24221 aagtataaca aagacgcttt gattttaaaa atgcaaggtt ttatcaattc aaagactact tattacaatt

24291 acaaaaacac agatcatata aaagcaatga ggtggccttt ccaatacaat attggtctca aaacaaatga

24361 ccccaatgta gatttaataa attatctacc taaaaataaa atagattcag taaatgttag tcaaacatta

24431 ggttataaca taggtggtaa ttttaatagt ggtccatcaa caggaggtaa tggttcattt aattattcaa

24501 aaacaattag ttataataaa ataaaaagta ggtgataaga tgactcaatt tctaggggcg cttcttctta

24571 caggagtttt aggttacata ccatataaat atctaacaat gataggttta gttagtgaaa aaaacaaggt

24641 tatcaatact cctgtattat tgattttttc tattgaaaca tgtttgatat ggttttatag ttttataatt

24711 tttaataatg ttgatttaaa aaatttgaat ttaattcagt tgcttacagg tctaaaagca aatattttgt

24781 tcctatttat tcttgtttta acagtgtttg tatttaatcc tttaattgtt aaatttatta tctggttaat

24851 taatataacc agaaagttta tgaaattgga ttgtataagc ttattagaca aaagagacaa gttgtttaat

24921 aacaacggta aaccagtatt tatagttata aaagactttg aaaacagaat cattgaagag ggtgaactta

24991 aaacctataa ttcagctggt agcgatttcg atttactaga agttgagcga caagatttca aagtatctga

25061 tttaccgtca aacgatgaat tgtatattaa acatacactt gtagacctta aacaacaaat taaattggat

25131 ttatatttaa tgaatgaata ctaatctttt ttcttagctt tttctgataa agtgcttttt aatttttcgc

25201 tggcgcccgg cttttcaaaa cttttgttta ttgggttact acgagtagct tcttgttttt tgtttttatc

25271 cgccataaaa ttctcaccac cattcaacgt ctacacttgt aggcgttttt ttatttagta aagtcataat

25341 gaatcttctc tggttaactt atctccatct attttttgtg aaataaattc caagtattta cgcgcattat

25411 gtgacgataa atctttaggt aactcataag tgaatggttg attaccacta gttaaaactt catatactat

25481 agtttctttt tttattttgc aattagttat tttcattata aacttccttt caaacactgc tgaaatagac

25551 gtcttttata ttaaagcgcc acacaggcgc tgttaatcac aatacaactt tgcccattac tttaatatta

25621 ctaaacgaag cgactttgat atcatcatac ttcggattta gagataccaa attaatatag tcttcgcata

25691 tatctacacg cttgataaga cttactccat ctaatacaac gagtgcaatt gtaccatctt taatagaatc

25761 ttctttctta a&aaaagcgt atgttccttg ttttaacata ggttccattg aatcaccatt aactaaaata

25831 caaaaatcag cattcgatgg cgtttcgtct tctttaaaaa atacttcttc atgcaatatg tcatcatata

25901 attcttctcc tatgccagca ccagttgcac cacatgcaat atacgatact agtttagact ctttatatcc

25971 atctatagaa gtgactttat tctgttcttc caattgttca tttgcatagt taagtacgtt ttcttggcgg

26041 ggaggtgtga gtttgttgta tatggaagtg atgtcgtcat cgtctttgta tgtagtattt gattcactat

26111 acaaatcatt aatcttcaca tcgaagtact cagccaaaat tttggcagtt gataatcgag gttcttcctt 

26181 ttcactttcc cattttgata tcttgccttt cgttaatttc attaagtcgg gatatttatt atcaagatca 

26251 gttgctaatt gttccatagt catattttta tttttttctt agcttcttta aaccttcacc aatacccata 

26321 cgaaaccctc cttatataag ataatttcat tataaaagtt tcgaaaacga aacgcaagga aaatattatt 

26391 gcaaaagttg ttgacatcga aacttttatg atgtattctt aaatcaagtt gttacaaacg aaacaaaagg 

26461 agggggttca atgacaacta gtgtagcaga taaaccatac ttaaaaataa aaagcttgat tgcacttaaa 

26531 ggaactaacc aaaaagaagt tgctaaagca atcggaatga gtagaagttt atcgagtata aagataaatc 

26601 gaattaatgg cagagatttt acaacttcag aagctaaaaa attagcagat catttaaatg ttaaagttga 

26671 tgattttttt taaactttaa gtttcgaaag tgacaactaa ataaaaataa ggaggacact atggaacaaa 

26741 taacgttaac caaagaagag ttgaaagaaa ttacagcgaa agaagttaga aatgctataa aaggcgagaa 

26811 accaatcagc tcaggtgcaa ttttcagtaa agtaagaatc aataatgacg atttagaaga aatcaataaa

26881 aaactcaatt tcgcaaaaga tttgtcgcta ggaagattga ggaagctcaa tcatccgatt ccgctaaaaa 

26951 agtatcagca tggcttcgaa tcaattcatc aaaaagctta tgtacaagat gttcatgacc atattagaaa 

27021 attaacatta tcaatttttg gagtgacact taattcagac ttgagtgaaa gtgaatacaa cctagcagca 

27091 aaaatttata gagatatcaa aaactattat ttatatatct atgaaaagag agtttcagaa ttaactatcg 

27161 atgatttcga atgaaggagg aactacaaat gaaactacta agaaggctat tcaataaaaa acacgaaaac 

27231 ttaattgacg tgtggcatgg aaatcaatgg ttaaaagtga aagaaagcaa attaaaaaaa tataaagtgg 

27301 tctcggatag agaaggtaag aaatatctaa ttaaataagc gcacttaatt agtgcaagta atcaagtgcg 

27371 ctattgcctt acaatcctaa atcttttctg cttttttctt cttcttgtaa tcccaataac acagaagagt 

27441 aaatgctgaa atagtcacga gcaacgctat ctttagcgaa tgcaattacg tcatcaccga cttcttgcca 

27511 ctcgttatga atcttatgtc tatctagagc tctaggtaat agcgagattg taatatcgtg agcaattttc 

27581 tctaaatcca taaatttcac ctccttccac tgggagataa ctaaattata taacaaaaca acttaaagga 

27651 ggaacgacaa atgcaagctc aaaacaaaaa agtcatctat tactactatg acgaagaagg taataggcga 

27721 ccattagata ttcaaattaa tgacggatat gaactgatgg tccgatctca tttcatcaac aacaccattg 

27791 aagaaatacc atacgtaaat aataacttat atgccttggt tgatggttat gaatttaagt tagattgaat 

27861 ttttgagaaa gatattgaaa agctaatttc cccataagat taagagacat actggatgtt ttgttaacga 

27931 ctcttttaac ttcgttccaa gttttattgt ctctaatatt atcgagaaac tcatggccag accaagtgat 

28001 gtcatcaata atccaagaaa cgaccctgcc ttcgatgaat ttcagatcgc aacaaataaa tttagcttct 

28071 tctaatttta aaagtgagta cattactgtt tcaaaatcat atttatcaaa aataatatta tcgttgaaat 

28141 tatgtcgagt aagtggttca cctattttct tattagattc tatttctaag agcaagagtc taacgcaatc 

28211 gtgattaagt ttcatcctat cacctccata acaggagtat agcagaaagg atcataaaca tcttaaaagg 

28281 aggaataaca aatgaacatt caagaagcaa ctaagacagc tacaaaaaat cttgtctcta tgacacggaa 

28351 agattggaaa gaaagtcatc gaactaagat attaccaaca aatgatagtt ttttacaatg catcatttca 

28421 aatagcgatg ggacaaacct tatcagatat tggcaacctt cagccgatga cctcatggca aatgattggg 

28491 aagttataaa cccaactaga gaccaggaat tattgaagca attttagaaa tgctatcaat gatacttttt 

28561 aaattgtttt taaactcatt ttcaaagtaa acaacagtct tgtctgaaat tgttacatga taaatagtgt 

28631 tactagcata cacgccgttt aggaacccag agtttttaag tttatttaaa tcgtatttta catcttcgaa 

28701 atgtagtttt tgaaaatact ttgtatgtat atctttagca cttccaaaat tattgcaggt taatttaacc 

28771 gaacctaact ttacacattc taaataatct ttgtagagta cggacaagat atattgttgg tctttagtaa 

28841 gtgtatcaaa ctcatcagat atcaagggca tgttatcacc tccttaggtt gataacaaca ttatacacga 

28911 aaggagcata aacaaatgaa cacaagatca gaaggattgc gtataggcgt cccacaagtt tctagcaaag 

28981 ctgatgcttc ttcatcctat ttaacggaaa aggaacgtaa cttaggagcg gaaatattag agcttattaa

29051 aaaaagtgat tacagctact tagaaataaa caaagttttc tatgcattag atagagaact tcaatacagg 

29121 gcgaataata acaaacttta acatttatct aaaggagtga tagagatgcc aaaaatcata ataccaccaa 

29191 caccagaaaa cacatatcga ggcgaagaaa aatttgtgaa aaagttatac gcaacaccta cacaaatcca 

29261 tcaattgttt ggagtatgta gaagtacagt atacaactgg ttgaaatatt accgtgaaga taatttaggt 

29331 gtagaaaatt tatacattga ttattcagca acgggaacat tgattaatat ttctaaatta gaagagtatt 

29401 tgatcagaaa gcataaaaaa tggtattagg aggattatca aatgagcgac acatataaaa gctacctatt 

29471 agcagtgttg tgcttcacgg tcttagcgat tgtactcatg ccgtttctat acttcactac agcatggtca 

29541 attgcgggat tcgcaagtat cgcaacattc atattttata aggaatactt ttatgaagaa taaagaaact

29611 gctacttgtt ggagcaagta acagtgcaag atgagcaatt gtcttaaata attatataag gagttattaa 

29681 tatgacctta caacaaaaaa tactatcaca ttttgcaaca tatgacaatt tcaattctga tgatgttgtt 

29751 gaagtttttg ggatatctaa aacacatgca aaatccacac tttcaagact taagaaaaaa ggaaagattg 

29821 aattggaaag ttggggtatc tggcgtgttg ttgaaccgca gttacattta actgttgtag aacgtaagaa 

29891 agagatatta gaagaacaat tcgagttatt ggcaagatta aacgaacaaa gtgatgaccc tagagaaata 

29961 gaagaacgca tcaagttaat gattcgttta gccaaccaat tttaaggagg agttaatcaa tggcaatatt 

30031 agaaggtatt tttgaagaat taaaactatt aaataagaat ttacgtgtgc taaatactga actatcaact 

30101 gtagattcat caattgtaca agagaaagtt aaagaagcac caatgccaaa agatgaaaca gctcaactgg 

30171 aatcagttga agaagttaag gaaacttctg ctgatttaac taaagattat gttttatcag taggaaaaga 

30241 gttccttaaa aaagcagata cttctgataa gaaagaattt agaaataaac ttaacgaact tggtgcggat 

30311 aagctatcta ctatcaaaga agagcattat gaaaaaattg ttgattttat gaatgcgaga ataaatgcat 

30381 gaagctagat cactcaaata gagctcatgc aaagcttagt gcaagtggag caaaacaatg gctaaactgt 

30451 ccaccgagta ttaaggcaag tgaaggtatt gcagataaaa gttcagtttt tgctgaagaa ggtacattcg 

30521 ctcatgagtt aagtgagtta tatttcagtc ttaaatatga aggcctaaca cagtttgagt ttaataaagc 

30591 ttttcaaaat tataagcgaa atcaatatta cagtgaagag ttgcgcgaat atgttgaaga gtacgtagct 

30661 aatgtagaag aaaaatataa cgaagctttg agtagagatg acgatgtaat agctttattt gaaacaaaat 

30731 tggatttagg taaatacgtc cctgaatctt ttggtactgg tgatgtcatt atattttcag gtggtgtact 

30801 tgaaattatt gaccttaaat acggtaaagg cattgaagtt tcagctatag ataatcctca acttagatta

30871 tatggcttgg gcgcatatga actgcttagt ttaatgtatg acattcatac agttcgcatg actatcatac

30941 aaccacgaat agataacttt tctactgaag agttaccaat atcaagatta cttcaatggg gaaccgattt 

31011 tgttaaacca ttagccagac ttgcttataa cggtgaaggt gagtttaaag caggtagtca ttgtagattc

31081 tgtaagataa agcattcatg tagaacacgt gcagaataca tgcaaaatgt gcctcaaaag ccaccacatt

31151 tgttgagtga tgaagagatt gcagaacttt tatataaact gcctgacatc aaaaaatggg ctgatgaagt 

31221 agaaaaatat gcactagatc aagcgaaaga aaatgataaa aactattctg gttggaagct tgtagaaggt

31291 cgctcgcgaa gaatgataac tgatacaaat gcaacgcttg aaaagttagt tgaagcaggt tataaacctg
ggcattgaag
tttggaaggc 
tagattttat 
gctttagttg 
ggtcaaagtt 
tgcgatttct 
gtagaaatac 
caaaaggatt 
catgcaacat 
tacaatcaga 
aatataacgt 
gtgcgtatca 
ttagatatag 
agatgaagtt 
gcgattttga 
catggcttta 
aatgttcaag 
aagacaatgt 
tacggcacag 
ttaaagaatg 
gatgaccatg 
caggataacg 
caggctggca 
tttggaatac 
aaaaataatg 
tattatacaa 
gcgattttat 
tgataggttg 
ataaacgaac 
ctatgacgtt 
aagcaaagtg 
tagctaagtt 
cgttgaaaaa 
aataaaattt 
agcgcgtata 
cgtttgatga 
tgaagcagca 
attacaagtg 
gacatatttt 
taagcattcc 
gattgaaaca 
acgttagaga 
tacctgtaac 
attggttatc 



agaaaccaag 
gaaggcttta 
agcaatctgc 
aagtattaaa 
cagtatgcaa 
ataaaagcca 
ttcctgcaaa 
cgcttatttt 
gattctggaa 
gtaataaggg 
aagtgcagca 
ggtttttagc 
tgaaacatat 
atcttaatta 
agcctttcca 
caatgctaat 
tgcacaatgg 
tacaaaacca 
agttaatgga 
tgtattcgag 
aacaagcata 
aggagctaat 
gaaaatccta 
tacaaaagaa 
attgcaaatg 
gtaagaggtc 
atttaacaaa 
tgacgattta 
actgctgaag 
atgcaaaaga 
aatgtttaat 
ttagctttag 
aaaacgagtt 
ttgccaagag 
atgaaaaaag 
gtgaaaatag 
aaaaacgtat 
atagcaaggc 
ctagaggttc 
gaatttgaat 
caagcttata 
ttgataaaga 
tataaatatt 
aaggaaaact 
atacaaacat 
gaatatatgc 
acctatacca 



ttacttagca 
tagaaaagcc 
tgaagatgat 
caaaactaaa 



tgatggaacg 
aaatggagtg 
taagcaacca 
tatccacctc 
gtcatgagga 
acagatagcg 
aacgagtggc 
aggcgtgaaa 
gaggaacaga 
aaaatagacg 
cagacaaaaa 
attagttaat 
tttgaagtgt 
gtttaaaaaa 
aatttggaaa 
aatactggtt 
aagagagtgc 
gatttaataa 
ttatggagga 
acggcaatga 
aatacttgag 
ggaattatga 
aaactattat 
cagctaaaac 
agagggctat 
ggcaagatag 
gtgaggatat 
agataaattt 
gtgcctatat 



gaagggcaag 
ctgaacaagc 
tctgaaactt 
attaacgcat 
ctattgtaag 
tatcgcagtt 
gaagatgatc 
cccactttaa 
agcagtaacg 
tagcttattc 
cgctgattat 
ttcgaaagaa 
ttaattcaat 
aaaagataaa 
ggaagaacaa 
atgtagaagt 
ttgggttttt 
gtgctcgata 
atagtcctac 
aacggttcag 
tctaaaacca 
tgtttcaatt 
gcattatatt 
gatttattac 
aaggtaatga 
acaatggcgt 
gtaccggtag 
gctatcaagg 
acaaggttta 
gctgcaatta 
gttttctaat 
ttggggtagt 
ggtgggaagt 
ttgaagcatt 
aaatggactt 
agtgacgggt 
tcaatgcttc 
aaaagaaaat 
aaggttttag 
aatgacattg 
aaaatgcgcg 
agagtcaaat 
acgaaatgtg 
ccacaaaaac 



aagcataatg 
gcgcactaca 
aactagcatt 
tttagcaaag 
catgtaaaga 
acatatccat 
aagacatttg 
agcaactaac 
aattaattaa 



tgatgaaaac 
gctactttta 
acgagattca 
cgacagaatt 
gaaggagagt 
tgatgaagca 
aacaaagagc 
acgaatttgg 
agacgcaaat 
cacagaggga 
ggtttaccta 
cgggtcaagt 
caaaagacgc 
tgtcgtactt 
gtcggcttat 
acgcgggata 
gagtaacttt 
ttatgtatga 
ggacacctga 




ttacgaatct 
acaaggtaaa 
tttgacaaac 
gtgattacag 
aagcaaagta 
tatagaagcc 
ccattacgtg 
caagcaaaca 
tggtgactat 
ggattgaaca 
tcgatgaatt 
ttttaaagaa 
atatttcgaa 
aatagatggt 
gagacgttta 
cttgtcttgc 
gcgtattggc 
gcaggtaaaa 
gaaatttgcc 
agaaatgaca 
gaccaacata 
agcagagtaa 
acagttattg 
gattacttaa 
gtgtgaaaaa 
ctacggtgcc 
tcagatactg 
tcaatgttca 
actagcagta 
ttagatgtgt 
aaagcataac 
tggcgctgga 
gttgatagtt 
atactgtaaa 
gattgaactg 
caagttgttg 
tagtcgagaa 
aggttttaaa 
aaggaaatcg 
ttacttctcc 
tgttgacatt 
tcggcggact 
atttagaggt 
agagaagttt 
gcgatgaatc 
aaaagatgaa 
agagcggaat 
attcacgtga 
agcgtaatca 
catacggcga 
cgcaataggt 
gcgaagtttt 
atacttaaac 
gtagtgaatg 
atacagcgga 
tttattttaa 
gtgggcgcga 
tttatttact 
gtgtcaaagt 
agaaggagtc 
tttcttgcta 
tgattaaata 
agcaagtttt 
tggagaagaa 
caaagatggt 
gaacaaccgc 
ttattaaaag 
atgctatgca 
tgtctataaa 
tagaatgcca 
gagccacaag 
taactagccg 
ccatggtaat 
ggtcggagtc 
atggtagtta 
actaaagcaa 



agaaaaatca 
ttaacactcg 
cataaaaact 
gaaaagtaag 
ttcaatcagt 
gctaaagaag 
atggagatac 
agcacctggt 
attagagctt 
acattcaact 
agacactgat 
attgaggtgt 
atgtggtgtc 
ggaccgacta 
aaactgctct 
taaacactct 
ttacctgctt 
atttaattcg 
tgaacatgat 
attgctaata 
taaacgacag 
agaagaattg 
gcttggttaa 
aagtagcaac 
atacaacaaa 
ggtactggaa 
aattagaaat 
tcctcaagac 
agtgattttt 
tcaacacaca 
taaaggcgac 
gctttaaaag 
ggcgtaacgc 
atcccgaaag 
cctagtggaa 
aatttatggg 
tattgttcaa 
atagttggcc 
aaactatcat 
gttttatatg 
agaattccta 
atttatttaa 
ggaatgatgg 
cagaaaaata 
attgtgtgga 
gaaaaagaga 
atgaagaaga 
tccgtactgg 
gtaacagaaa 
cattgaaatt 
aatgcaataa 
acgtccaaag 
caatttttcg 
gcacttatta 
agagctcgaa 
ggagatagaa 
gaaaatccgg 
tcggtgtttt 
tgaagaagaa 
tataaatctg 
aagcattcta 
atggaacacg 
tgaagcaacg 
agcaagtgca 
gaaagagtta 
gaacaaatcg 
agattgagga 
agatgcaatc 
tatgaggagg 
gaacgaaatc 
aaaaggcagt 
tagtggtgta 
ttagggatta 
cttctggtat 
tgtcataaat 
gtggaggaat 



atcggcaaaa 
ctaccgagtc 
aaaaaggacg 
agcatcatat 
ttaatcattc 
aaggaaaagt 
tgaaagagaa 
attattgacc 
caatcaattt 



tgtagaaaaa 
gatgaggatt 
caagaatttg 
tataaataca 
gtgcgattga 
atttgaccct 
aataaacaga 
cgcttgataa 
ttatttctct 
cttgaaaaat 
aaattaaaga 
aggtattaag 
cttaaacaag 
aggatgaaca 
aggaaaagct 
atgcatgaca 
gatgggcagg 
agcaagagat 
ttattaagtc 
ctgcaataga 
cggaaagata 
cctctcagac 
caatgggtgc 
aaatcctaac 
acgcatcata 
gagctttagc 
gttagatctt 
gcaactgcaa 
atgtccatga 
gaataagcct 
aaggattagg 
cagaagtcga 
taatccaggt 
ctagaagaaa 
tcacatatct 
agaaaagact 
gagaaaaaat 
aagaaagaga 
ttcgatgtca 
agtagatatg 
atagatttta 
aatacttgtc 
agcttttgac 
gatctaagag 
ctttcacggg 
acatatataa 
atgatgaaaa 
agctatcatt 
taaaaattgt 
gtaaccgaag 
catcatatga 
catcttaaac 
gttcaaaaga 
tgatgagctt 
tgggataggt 
aatttggaat 
aacaatcagt 
cgtgtataag 
aaagaagata 
agcaggaaaa 
ataagacgga 
gatcaaaaca 
agtagtaaaa 
atatcaagaa 
agatggaaaa 
aaaggcgaca 
tcgagagtgt 



aagcattttc 
tgataaacga 
gtatataaac 
gcacatattt 
ctaaatcaga 
tagtaagttt 
gatgatgtga 
aaaacaaaat 
atttccattc 
ggcgaacctc 
tcttataagt 
aaatttatga 
cagaagctga 
catgactaaa 
gctgtaaaaa 
tgccacctga 
agtcggagaa 
ataccttgta 
ggcaacaatt 
ctttccagta 
ctttctaaat 
ctaaacatat 
aggattagat 
aaaaaaatgc 
tgatgtgcag 
tagaggtgta
cttattaaag 
aattagttag 
ggcaagagtc 
tatgaagcat 
aaaaaggaaa 
attggaaatg 
atagttaatt 
cacatggact 
ttatccaaaa 
aaccgtaaat 
gggatttact 
tgaagtaatt 
gttgattggg 
agtgtgattg 
aagtgttaat 
gaactattaa 
agttataaga 
ccagaacttc 
caaaatctaa 
cagaaaaaaa 
agattgagac 
cttataacca 
aacgaagcgc 
tcgaacaggt 
tagagcacct 
ttgtgggagt 
atatctgtat 
catatcgtac 
agcaacatgg 
tcaaagttga 
tggcagaaaa 
tttaaaataa 
aaactaagtt 
gaatgctagt 
gacgacctaa 
atattacgaa 
attggagata 
attgcaagag 
ggaattaaac 
tattagtgct 
aaagcgcaag 
ttggtcttga 
cgactaacat 
tgcaggttat 
gatgtagctg 
cgcatttagt 
tgataatgaa 
tacaccctac 
aactagctca 
ttcagaacgt 
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ggagcaaaag 

tgagtgacat 

tctcaagagt 

tctacatttg 

agctcaaaca 

ggacgtaagc 

aagaagaagc 

tgaaaatata 

agccaggcaa 

tgactttgaa 

tgtgtttagc 

ttattgacct 

atcttttcaa 

aattagaaaa 

ccgaacatgt 

ctgctagtcg 

acaaacagtt 

aaagatgttg 

gttcaatgtt 

ttttgcatat 

ttaaaacgaa 

attacttcga 

atttttcttt 

actgacacat 

gcgacccaga 

aacttttatt 

ggtggattgg 

tgcttgtgaa 

tactccggct 

aagcaattaa 

catggtctga 

attgcgtaat 

gggaaaatgc 

gttatatcga 

gcaaaatgcc 

aagttattta 

ctgctggaat 

aggtgtaggt 

actggtaagg 

gaaaagctga 

acattatatc 

gatgaaactg 

aactaaccaa 

gttccttaac 

acaggtatta 

gaagacgatt 

ggtctgtgcg 

attagaaaga 

aaattcgatt 

ataagaaata 

aacacggtgg 

gatacaacac 

ctataaagtt 

aagtttggca 

gaatcgacat 

cacctggaac 

gaagcaagaa 

acagtgtatg 

attgatttca 

tgtttttaga 

cactaaaaaa 

tggaaccatt 

acacagaggc 

atggccattt 

tctattaaaa 

tacaggattt 

tcgagaaagg 

tctgaagaaa 

tgcctgacag 

agaaaaaaac 

caaaaactac 

agaagttaga 

caaacatgat 

gaacgttgga 

tacaacaagg 

tgcaagatta 



gcttcggaag 
gttagaaata 
aaaaagacta 
ctgataaaca 
agtccaaggg 
atacacacat 
gaaagagaag 
agggagtgtg 
gtatgtaact 
aaaatcagag 
aatagcactg 
tgttaagtag 
gcagctagta 
agatttaaaa 
tagacaaagt 
tttatctaca 
agaactaaag 
gcggatttgt 
aacacttgat 
tgtttatatt 
atgtaaatgc 
tgatacaact 
acctatgaag 
tagaatggcc 
agaaaagccg 
cctgatttat 
tgttatacga 
cagttttgat 
aatcgactac 
ttaacgacaa 
gacgttagaa 
gatccaaatt 
catggaataa 
aaagattcat 
tatcatccag 
tcaaatactt 
cgctcgagta 
aaatctgctt 
aagcatatga 
agttgaagct 
gaagattttc 
gtggaagacg 
agaagagatc 
cctgaactag 
ttgatgaata 
ttatcaaggt 
cttgaagtgt 
tttctaacgt 
tggaaaagat 
ttgaataaat 
tgtaaaaagt 
ctctttccct 
taaaaagtag 
agaattgatg 
tagaaaaata 
aagaggtgta 
aagggaaagt 
tgttatggaa 
aaccacatag 
tatggggcta 
atgttagcca 
taaatcatct 
tgatatctat 
gacatggtcg 
agaaattacc 
atgggctcaa 
tactttaaac 
agatatatga 
agttgatact 
tatattttag 
ttcaactatc 
taagctagag 
aaagaaagaa 
atagtggaga 
tgggcacatt 
tatagacaag 



cagcggagcg 

tttttcatag 

tacacacaaa 

tcaaaagacg 

agctatgagg 

aactaaagct 

tacgaggcac 

ggaaacgact 

gagcaagtat 

ccgaagtttc 

gaggtgttgt 

catttctcat 

aaaaataatg 

gctaaagaga 

cactcaaata 

cattggacta 

aagatttaac 

cggtggttat 

atcgattatg 

caacacataa 

agacgagtat 

tatcaaccac 

atttaccttt 

aacgtcttca 

ggaattgttg 

acgaaaaaca 

aaataacaag 

ttagtacgca 

ctagttataa 

aatgtctgat 

attacttcga 

taaaaggaaa 

taattttaaa 

gacatacacc 

taagagatta 

aggtgttgaa 

atggagccag 

tgctaaaaaa 

ggcattacaa 

attaagcatt 

caaggcaatg 

tttttggcca 

gaccaaatct 

aagaagaaat 

tcttaacacg 

gatgttgata 

ttgttgaatg 

cttaagacaa 

tatggtgtac 

atacattttt 

aatcgtaggt 

tctcgctgta 

tgttagggag 

caacatcgga 

tttagtgaaa 

ccagatagaa 

tacatccttt 

taaagaacaa 

ctatcaaaag 

gggaaaacag 

tagcacctaa 

gaaagtgtct 

gtaaccaata 

taattgatga 

actcattaat 

gtttatttga 

caacacatca 

acgaatagaa 

aaacaaacag 

aatcggaaga 

taacggtgca 

gaaattatag 

tacttcaaag 

cattaagctg 

attgtttggt 

gacaaaatca 
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taaagacata 

ggtttggtgt 

cctatatgaa 

catatcttaa 

aatgacacaa 

aagagcaatc 

aagttaaaag 

aaacaaatac 

atattatgat 

atggtaatag 

aaatatgtgg 

aagatgaaaa 

gtgttgaagg 

ggcgttggct 

gaaacaatta 

atcacaatat 

tgagtacaat 

ttaaaagaag 

ctgctcaaga 

gcatagagag 

gaagctattg 

ataggttaat 

gttagaccca 

agggaagaga 

gtgcattttg 

ttctactaac 

tttgcctatt 

tacacttata 

agcaatgcag 

gcaatgcagg 

aaggtacttt 

aatagcattt 

atacgtcaat 

attcaggcaa 

tctaaataaa 

gacactgaag 

gatgtaaatt 

aataggtggt 

ggcgtttggt 

tcatatctaa 

tattttcatt 

atgactgtaa 

gggcagaagc 

gcgttcaatc 

ccaatcccaa 

tgttaccaac 

ttttggtaaa 

ttagacaatt 

agatagcgta 

agatgttgta 

gttgtatcat 

aggttcaacc 

taaaggggta 

acacaaatat 

gagataacaa 

ttattattat 

acaaaaatat 

gtaaatactt 

tatgcaatag 

tatcaacact 

acaagttgct 

ttagtcttag 

aagaaaatac 

actgtctaca 

agatttatag 

tagacagagg 

agttagcgaa 

gatatatgtt 

tagtcttatc 

agaaggaaca 

gtttatacag 

aggagtctca 

gtttaaggaa 

cttatagcac 

ttggacttac 

tacgactatt 



ttagatcgag 

ttatctactt 

atgttgttga 

tagcattttt 

tacctagtca 

aaaggtttac 

aaatgcagtt 

taagattatt 

gacggctaat 

ctattatcat 

attgtcattt 

ccatagaagc 

tatagaagat 

tctctgttct 

aatatgatcg 

ggcttggtct 

aaaatgtcta 

gcaaacgacg 

tatgactgac 

ataagtccaa 

ggcgtaaagt 

gtattggcct 

gataaaatat 

gtaagactaa 

tagagcctat 

cgttatacct 

ctcatcataa 

tggtgctcaa 

caaagagcgc 

atttcgatga 

caaagctagt 

aatgaattta 

ggcaagacgg 

aacaaaagat 

atatcgtggg 

tgaatagaac 

tgactatatg 

gcatggtttt 

taatggaaat 

acaagttgac 

ggtacaacta 

atccagagag 

taaatactat 

caaagtaaac 

gcaattggga 

aggaaatgta 

gataagggag 

ggtctgtata 

tgtaagagat 

tcaaatgttg 

ttttggtgat 

ctgtttgttt 

taggggtaac 

aaattttgta 

agttaaatgg 

gccagaagga 

gtgcatcggc 

ttataagaat 

ataaagtgat 

tacagcattt 

aaagatacat 

gaacacctaa 

taaatggtta 

tttaaaagtc 

gattaacagg 

cgaaagactt 

catgttttta 

taagcatgaa 

tgaaaaagaa 

gttgtagctc 

atgatgaaga 

aggccaacca 

gcaaccacat 

atccagcaag 

atggtcattg 

attcatcaca 



tcaaggaggt 

tgtcgcatag 

ttgctactat 

agtaatgttt 

caacatttaa 

agttgttgat 

attaaattag 

attcttacta 

gatgatgcag 

ttttgaatta 

caattgtttt 

attggagtat 

tatgaaaatg 

atttaaaata 

tgatgtttca 

gactttatgc 

agtctgaaca 

tgctggtcaa 

atattatcta 

gactgcgttt 

cgcagatatc 

tcaactagta 

taaatgaata 

aagattagca 

acgatagaag 

atcatgaagg 

tacggatccc 

gatgaagacg 

aaaatgatga 

aatagtaaat 

atcccaaata 

caaaacaaat 

tgatgatagc 

gccattataa 

atggacataa 

aactaccaaa 

cttacacttt 

ctgacagttt 

ggcagaactt 

cggtttcgtg 

ataaagttga 

agttgaagtg 

tatgaacaag 

atactgagga 

agacttaact 

gattacattg 

atagtagagg 

tgaaggcaat 

gaaagtttag 

catcattttt 

gcaacattga 

ccaatgttgc 

cctctaacag 

tacaaggtga 

attatgttta 

aaaacatatt 

aatttgaaaa 

ggtaggtgga 

tgataatgag 

agtgaattgc 

gggttgatga 

agaaagaaat 

tgtgatcaat 

ctaagagtca 

aacacctagt 

gagtcttcat 

actgggagct 

agcgaaagat 

agaaaagtat 

agaatggggc 

tgtaagactt 

atattattgt 

tagaggattc 

tgcagggcat 

gaattatacc 

tcatgaccga 



tttggggaag 

gtattatttt 

ctttgtgaca 

tttatgagta 

agattcaaca 

gcggagagta 

ggcagttgtt 

gcgatgtatg 

aggcgccgag 

attatattaa 

atctatattt 

atgaatgctt 

aagttgaacg 

atgaaaggag 

tattcttatg 

agaagctagc 

agccgatata 

gtcatgaatc 

tgttttatga 

agtgattcct 

gttggcatgg 

acgatgcgga 

tgttgattgg 

gataagcaag 

aagctataga 

ttcaactgca 

gtaagcggta 

ctaaaacaga 

agttgttaaa 

agcgatgatg 

tagaaattat 

tgaatgctta 

agtttaagaa 

gcgtagcaat 

acgtcttgaa 

aaggcattga 

atggtcctca 

agtttctgtt 

gcagctacaa 

ttgcttatgg 

tttcttaaga 

aactggtcta 

gagaagagtt 

atctccatat 

atctttgaaa 

aaagagacaa 

atctatggaa 

aaaagtggga 

aggatttaat 

tgagtgatgc 

tgcaacaaat 

atcaaattca 

ctatttttaa 

ataaatgaaa 

aaatgggtcg 

ttgtagaaat 

cagagatcat 

acatttggcg 

aaatacggtt 

agttgttaga 

agttgataag 

gatgcattaa 

ataaaaaaga 

aaggtttaaa 

ccaaatagtt 

tcagtcgtta 

aagagacgga 

tatctggata 

atgaagaatt 

atcattaagt

atacatgata

tttataactt 

aaactataaa 

ggattaaact 

aacaagcaaa 

taacacaata 
42001 gatcaaagag

42071 caagaatagc 

42141 atttaaatat 

42211 gagatactca 

42281 ttagaacaac 

42351 tgaagcagtt 

42421 aataaagata 

42491 caatacgaaa 

42561 gcaaaaggcc 

42631 cgacataaat 

42701 tgaccaagca 

42771 agcatggaag 

42841 gatattataa 

42911 acttagataa 

42981 taatcttaag 

43051 tgcccatcgg 



tacataaagc tttacaaaat 
taagcataag taatggaggt 
attgaatcag aaatatataa 
acccaacgaa agaactagac 
tgagttaatg gcgacaaggt 
gaaagtgagt acttaaagtt 
agaagctaaa gatagaacaa 
gaactttgtt aaagcgatag 
tacaaatctg tagtaatatg 
acatgaggca catcgctaag 
taataacatt tataagcatg 
aagttaagag agatagcatc 
cagatgcaaa gattgtgcat 
tctaatgtca gtttgttata 
aaaattagag ttctaaaaat 
cttaaaatgt tttttcgccg 




aaagaaccaa cgcaagaaga 
ataagatggg aaaggcgtca 
tttaaatgag aacaagaaag 
accaacattg tgtatggacc 
tattgaccaa taagatgtta 
acctgaagat cataagaaag 
ataggggatg cttgtcacat 
cgtatcatgc aggtatcaaa 
atagcatcgg aaagatgtat
cggtgtgtct tttgttatgc 
gtcgtaagtc atatcaatac 
agatagagat aattatcttt 
cacattattt atgttgatga 
gctgtcataa caaaattcat 
ttaaataaaa aaattattta 
ggtaccggag aggcc



attgatgaaa gctattaaag 
tatgatatta agccaggaac
agataaatag attgagaatg 
gttacaaaaa ggagagccag 
cgtaacttag aagagatggt 
taataaggtt aaagtattgg 
gcatcgcaat acagttacta
taacattgtg caaagattgt 
aaagttatct gaaagttata 
aatcaaagag gtgtaagaga 
gattggttct atcattcaaa 
gtcaaatgtg cttacgcgaa
agattttaac aaagctttag 
gcaaatgata atgacaaaag 
aataaaattt tatgcccccc 
Bacteriophage 3A ORFs list



SID 


LAN 


FRA 


POS 


a. a. 


RBS sequence 


STA 


STO 


100379 


3AORF001 


1 


8515..13488


1657 


acaggtacggatttaagaaaacttt 


ttg 


taa 


100380 


3AORF002 


2 


37667. .40114 


815 


tttaaaataatgaaaggagccgaac 


atg 


taa 


100381 


3AORF003 


1 


32188. .34149 


653 


ttaaagaaattgaggtgtcaagaat 


ttg 


tag 


100382 


3AORF004 


3 


17457. .19370 


637 


gctattttattagaaaggaaggtgc


att 


taa 


100383 


3AORF005 


1 


334 . .2034 


566 


agaaaaaagatagttcaagaagaag 


gtg 


taa 


100384 


3AORF006 


1 


15571. .17154 


527 


cttttatttataggtaggtgattta 


atg 


taa 


100385 


3AORF007 


2 


19337. .20836 


499 


atgatagtaaaacaagttcagggcc 


atg 


taa 


100386 


3AORF008 


3 


22176. .23630 


484 


aatgatttagggtaggtgttgacca 


atg 


tga 


100387 


3AORF009 


1 


40726. .42093 


455 


gtaaatacttttataagaatggtag 


gtg 


taa 


100388 


3AORF010 


3 


13491. .14738 


415 


gaggcggactaacgctacagtaaaa 


att 


taa 


100389 


3AORF011 


2 


2039. .3277 


412 


attaaagacataatgcgttaaggag


gtg 


taa 


100390 


3AORF012 


2 


4001. .5209 


402 


aaaaaagagaaaaaattaaacgcga


atg


taa 


100391 


3AORF013 


1 


30379. .31545 


388 


attttatgaatgcgagaataaatgc 


atg 


taa 


100392 


3AORF014 


2 


14738. .15562 


274 


attatatgggaggtttgactaatta 


atg 


tag 


100393 


3AORF015 


3 


3249. .4034 


261 


cttgaattaagaaaatctttgaaag 


gtg 


tag 


100394 


3AORF016 


-2 


25587. .26273 


228 


aagaagctaagaaaaaaataaaaat


atg 


tga 


100395 


3AORF017 


3 


6729. .7370 


213 


ttaatttttaaggaggaaataagca 


atg 


taa 


100396 


3AORF018 


3 


24540. .25154 


204 


aataaaataaaaagtaggtgataag 


atg 


taa 


100397 


3AORF019 


2 


31565. .32128 


187 


ctataaaaattaaaaaggacggtat


ata


taa 


100398 


3AORF020 


3 


36150. .36713 


187 


gcagtaggaattatgacgggtcaag 


ttg 


taa 


100399 


3AORF021 


2 


24011. .24535 


174 


gtaataaaatttataaagaaaggaa


atg 


tga 


100400 


3AORF022 


-2 


12423. .12938 


171 


taaagtaccagtagacaatgtaggt 


att 


tga 


100401 


3AORF023 


1 


7462. .7917 


151 


aaaataaatcaaaggagaataattt


atg 


taa 


100402 


3AORF024 


1 


26731. .27174 


147 


actaaataaaaataaggaggacact 


atg 


tga 


100403 


3AORF025 


1 


42106. .42543 


145 


taagcataagtaatggaggtataag 


atg 


taa 


100404 


3AORF026 


2 


35255 . .35671 


138 


aagcaactaactttattttaaggag 


ata 


taa 


100405 


3AORF027

2 


5888 . . 6298 


136 


atattggctataatacagtggtttt 


atc


taa 


100406 


3AORF028 


-3 


27845..28255


136 


ccttttaagatgtttatgatccttt 


ctg 


taa 


100407 


3AORF029 


3 


34344 . .34748 


134 


ttaaggttttagatttagaggtgga 


atg 


taa 


100408 


3AORF030 


2 


6299 . .6694 


131 


t a t aaaaaaggagt t ggc cagat aa 


atg 


tag 


100409 


3AORF031 


1 


20833 . .21225 


130 


ttaacaaaattataggagtgagaaa 


ata 


taa 


100410 


3AORF032


-2 


39984 . .40361 


125 


aaatagctgttagagggttacccct 


ata 


tag 


100411 


3AORF033


1 


7957. .8325 


122 


gaatatctgcgtcttttttatttga 


ata 


taa 


100412 


3AORF034 


-2 


28506. .28871 


121 


gttatcaacctaaggaggtgataac 


atg 


tag 


100413 


3AORF035 


-2 


10671. .11036 


121 


tcctagcttcctaacagcaccgcca 


ata 


tga 


100414 


3AORF036 


2 


30020. .30382 


120 


accaattttaaggaggagttaatca


atg 


tga 


100415 


3AORF037 


2 


21818 . . 22165 


115 


aagtgtaagtaatagttaagagtca 


gtg 


tag 


100416 


3AORF038 


-2 


42003 . .42347 


114 


gtactcactttcaactgcttcaacc 


atc


tga 


100417 


3AORF039 


2 


21386..21727


113 


tccagaaaatctagagtcataggtt 


ata 


taa 


100418 


3AORF040


-3 


29654 . . 29995 


113 


ttgattaactcctccttaaaattgg 


ttg 


taa 


100419 


3AORF041 


-1 


4333 . .4671 


112 


tactaaatctacatctgatccatga 


att 


tga 


100420 


3AORF042 


3 


5568. .5900 


110 


taaaaaagtggtaggtgatttttaa 


atg 


tga 


100421 


3AORF043 


1 


25690. .26019 


109 


taccaaattaatatagtcttcgcat 


ata 


tag 


100422 


3AORF044 


3 


29676. .30005 


109 


gtcttaaataattatataaggagtt 


att 


taa 


100423 


3AORF045


3 


30. .353 


107 


cgctagcaacgcggataaatttttc 


atg 


taa 


100424 


3AORF046


3 


27894. .28214 


106 


aagatattgaaaagctaatttcccc 


ata 


tga 


100425 


3AORF047


-2 


11907. .12227 


106 


ttcgccgccaaaatgattagcattt


ctg 


tga 


100426 


3AORF048 


-3 


40343. .40663 


106 


ccataacacatacactgtatgatct 


ctg 


taa 


100427 


3AORF049 


-3 


6749. .7069 


106 


tgttaaaccatcttcagattctcca 


ata 


taa 


100428 


3AORF050 


1 


42700. .43014 


104 


ttatgcaatcaaagaggtgtaagag 


atg 


taa 


100429 


3AORF051 


-2 


13077. .13388 


103 


ttgtacgtaatcccacacatcgccg 


att 


tga 


100430 


3AORF052 


-3 


3722 . .4024 


100 


gcatttcatttcctcctaataactc 


att 


tga 


100431 


3AORF053 


3 


17145..17444


99 


tcgagacaatggatatagggagtgt 


att 


tag 


100432 


3AORF054 


-1 


19915. .20211 


98 


ataatttatagcttgcgaaacataa 


ata 


tga 


100433 


3AORF055 


-1 


42436. .42729 


97 


aatcgtattgatatgacttacgacc


atg


tag


100434 


3AORF056 


3 


40455. .40745 


96 


taaattttgtatacaaggtgaataa 


atg


tga 


100435 


3AORF057 


-1 


38665. .38952 


95 


atcatcaccgtcttgccattgacgt 


att 


taa 


100436 


3AORF058 


-1 


21265. .21549 


94 


gaaatttctatctaacttgtcataa 


att 


tga 


100437 


3AORF059 


-2 


10278. .10562 


94 


tttagccgcgcttccaactgcacgt 


att 


tag 


100438 


3AORF060 


1 


5278. .5556 


92 


atatcagccgaataggggtgatgaa


atg 


tag 


100439 


3AORF061 


1 


35668. .35946 


92 


tttggaaagaaggagagttgattaa


ata 


taa 


100440 


3AORF062 


2 


35912. .36187 


91 


gttaaatttggaatggaattaaaca


ata 


taa 
100441


3AORF063 


3 


36720. .36995 


91 


cggaagtagcggagtgtaaagacat 


att 


tga 


100442 


3AORF064 


-2 


35694. .35969 


91 


ccgttatacgcgctagcactaataa 


ctg 


taa 


100443 


3AORF065 


-2 


32697. .32972 


91 


aaccgttttcttttgtaaattaggt 


ata 


taa 


100444 


3AORF066


3 


29157. .29429 


90 


caaactttaacatttatctaaagga 


gtg 


tag 


100445 


3AORF067 


-2 


26661. .26930 


89 


atacttttttagcggaatcggatga 


ttg 


taa 


100446 


3AORF068 


-2 


9624. .9893 


89 


ttttaatgcatctcccatgtattga 


ata 


tga 


100447 


3AORF069 


-3 


13847. .14110 


87 


tgcatttcctcctgattcgtgttga 


atc


tga 


100448 


3AORF070 


1 


34993. .35250 


85 


tttacgtccaaagagcttttgactt 


gtg 


taa 


100449 


3AORF071 


2 


34745 . .35002 


85 


aaatgttcaagaaatggagtgaagc 


ata 


tga 


100450 


3AORF072 


-1 


27379..27636


85 


tttgtcgttcctcctctaagttgtt 


ttg 


taa 


100451 


3AORF073 


2 


37367. .37615 


82 


tggtaatagctattatcatttttga 


att 


taa 


100452


3AORF074 


-2 


23466. .23714 


82 


cgtttgtttttttaaaatttaatat 


att 


taa 


100453 


3AORF075 


-3 


2471. .2719 


82 


agtactgtttgaaatcttctaacac 


ttg 


tga 


100454 


3AORF076 


1 


26047. .26292 


81 


aagtacgttttcttggcggggaggt 


gtg 


tag 


100455 


3AORF077 


2 


28292..28537


81 


aacatcttaaaaggaggaataacaa 


atg 


tag 


100456 


3AORF078 


-1 


5836. .6075 


79 


ttttgtataaggcttagatttagtc 


att 


taa 


100457 


3AORF079 


-2 


5460. .5699 


79 


attcagtcgcttttaaaatttctct 


atc


taa 


100458 


3AORF080 


-2 


31350. .31586 


78 


cctgtaatcactttagttttattta 


ata 


taa 


100459 


3AORF081 


-3 


8252. .8488 


78 


aagttttcttaaatccgtacctgta 


atg 


tga 


100460 


3AORF082 


-1 


35905. .36138 


77 


atatttatagacaacttgacccgtc 


ata 


taa 


100461 


3AORF083 


-1 


34039. .34272 


77 


atagttcacctggattattaaataa 


ata 


tga 


100462 


3AORF084 


-1 


12007 . .12240 


77 


acatttttttcatttcgccgccaaa 


atg 


taa 


100463 


3AORF085 


-2 


32367 . .32597 


76 


cttacaaggtatagagaaataacga 


att 


taa 


100464 


3AORF086


-2 


30618 . .30848 


76 


atataatctaagttgaggattatct 


ata 


taa 


100465 


3AORF087 


-3 


24746 . .24973 


75 


ataggttttaagttcaccctcttca 


atg 


tga 


100466 


3AORF088 


-3 


12980 . .13204 


74 


tctttctttttcgtaccaccatgga 


att 


tag 


100467 


3AORF089


3 


4290. .4508 


72 


acaggagaagcttatcaatctttaa 


atg 


taa 


100468 


3AORF090 


3 


28926. .29141 


71 


ttatacacgaaaggagcataaacaa 


atg 


taa 


100469 


3AORF091 


-2 


13587. .13802 


71 


cttgtcttgctaattgcttagataa 


atg 


tag 


100470 


3AORF092 


2 


26471. .26683 


70 


aaacgaaacaaaaggagggggttca 


atg 


taa 


100471 


3AORF093 


-1 


2524. .2736 


70 


tccaccgttttcttcatagtactgc 


ttg 


tga 


100472 


3AORF094 


-3 


25334. .25546 


70 


tggcgctttaatataaaagacgtct 


att 


tga 


100473 


3AORF095 


3 


8316. .8525 


69 


aagagatgggaaagacagaagaaca 


atc


tag 


100474 


3AORF096 


2 


36992. .37198 


68 


aacaagttcaagggagctatgagga 


atg 


tga 


100475 


3AORF097 


-1 


32593. .32799 


68 


aaagcttaatacctctgtcgtttat 


atg 


taa 


100476 


3AORF098 


-1 


15346. .15552 


68 


aatccattaaatcacctacctataa 


ata 


tag 


100477 


3AORF099 


1 


7225. .7428 


67 


actggtgactggatgaacagaaaag 


ttg 


tag 


100478 


3AORF100 


-2 


22620. .22823 


67 


cgacttcatgaccggcatgtcttaa 


ata 


taa 


100479 


3AORF101 


-1 


40060. .40260 


66 


aaccttacagcgagaagggaaagag 


gtg 


taa 


100480 


3AORF102 


-1 


35035. .35235 


66 


ttctatctccttaaaataaagttag 


ttg 


taa 


100481 


3AORF103 


-2 


1149. .1349 


66 


atttttttggagtgttgggtaatca 


ata 


taa 


100482 


3AORF104 


1 


27661. .27858 


65 


aaacaacttaaaggaggaacgacaa 


atg 


tga 


100483 


3AORF105 


-2 


9420. .9617 


65 


gcctaagtcaaccgcttgattagac 


atg 


tga 


100484 


3AORF106 


-2 


23244 . .23438 


64 


caccagtaattcttgaattagttga 


ata 


taa 


100485 


3AORF107 


2 


11966 . .12157 


63 


tctaaaaaagatgctgtagtagacg 


ttg 


taa 


100486 


3AORF108 


-3 


35054 . .35245 


63 


ttttcatcatttctatctccttaaa 


ata 


tag 


100487 


3AORF109 


-3 


16010. .16201 


63 


gttcttaattccaatgtactgacag 


ttg 


taa 


100488


3AORF110 


-1 


6184 . .6372 


62 


attttcagtgactttataatagtat 


att 


taa 


100489 


3AORF111


-2 


16500. .16688 


62 


gtagtcaacaattgctttgtattga 


ttg 


tga 


100490 


3AORF112


-2 


8502. .8690 


62 


cttaattctcgcctgatacttttcc 


att 


taa 


100491 


3AORF113


1 


34162 . .34347 


61 


tatgaaggattaggagtgtgattgc 


atg 


tga 


100492 


3AORF114 


2 


12356. .12541 


61 


ggatatcacactaaggctatagcta 


ata 


taa 


100493 


3AORF115 


-2 


7635. .7820 


61 


tgaagttccctcagctacaccgtga 


att 


tga 


100494 


3AORF116 


-1 


26434 . .26613 


59 


tttagcttctgaagttgtaaaatct 


ctg 


tga 


100495 


3AORF117


-3 


17804. .17983 


59 


atagccattatttctagcttgtgtc 


atg 


tga 


100496 


3AORF118 


2 


27899. .28075 


58 


attgaaaagctaatttccccataag 


att 


taa 


100497 


3AORF119 


-1 


39268. .39444 


58 


acgaaaccggtcaacttgtttagat 


atg 


tga 


100498


3AORF120 


-2 


37152. .37328 


58 


tagctattaccatgaaacttcagct 


ctg 


taa 


100499 


3AORF121 


-2 


18900. .19076 


58 


aaggtactctctcccatttaccact 


att 


taa 


100500 


3AORF122


-1 


21550. .21723 


57 


taagcatggtaatcacctcctttaa 


atg 


taa 


100501 


3AORF123


-3 


33062. .33235 


57 


aaacgttgttctttaataagatctc 


ttg 


tag 


100502 


3AORF124 


2 


21212. .21382 


56 


aaattagaagaggttaaaggagaga 


ctg 


tag 


100503 


3AORF125


-1 


22051. .22221 


56 


aaatcaggattgaactgcttcccta 


atg 


tga 


100504 


3AORF126


-2 


7821. .7991 


56 


tgtttttcctgttttacggtcttta 


att 


tga 


100505 


3AORF127


-3 


34712. .34882 


56 


ttgcattacctattgcgaatgctag 


ttg 


taa 


100506 


3AORF128 


-3 


24056. .24226 


56 


tttttaaaatcaaagcgtctttgtt


ata 


taa


100507 


3AORF129


-3 


4940. .5110 


56 


cataccatgcagttaatacaaacaa 


ata 


tga 


100508 


3AORF130


3 


27171. .27338 


55 


cagaattaactatcgatgatttcga 


atg 


taa 


100509 


3AORF131 


-1 


40387. .40554 


55 


ccttctggcataataataattctat 


ctg 


taa 


100510 


3AORF132 


-2 


1860..2027


55 


gcgataacattcacctccttaacgc 


att 


tga 


100511 


3AORF133 


-3 


42317. .42484 


55 


acaaagtcctttcgtattgtagtaa 


ctg 


tag 


100512 


3AORF134 


2 


12671. .12835 


54 


tcatacaaatctttaaaaggttgga 


ctg 


tag 
100513


3AORF135 


-1 


39484. .39648 


54 


ataatagtatttagcttctgcccag 


att 


taa 


100514 


3AORF136


1 


29710. .29871 


53


accttacaacaaaaaatactatcac 


att 


taa 


100515


3AORF137


1 


37186. .37347 


53 


ggcagttgtttgaaaatataaggga 


gtg 


taa 


100516 


3AORF138 


2 


20996. .21157 


53


aatggggaaatagtttttaacgaag 


att 


taa 


100517 


3AORF139


3 


15114 . .15275 


53 


tcaactgaaattgaagtaagtttaa 


atg 


taa 


100518 


3AORF140


3 


29442. .29603 


53


aaaatggtattaggaggattatcaa 


atg 


taa 


100519 


3AORF141 


-1 


39883. .40044 


53 


tacaccataatcttttccaaatcga 


att 


taa 


100520 


3AORF142 


-1 


20416 . .20577 


53


accacctggaaaagtcccataaaaa 


att 


tga 


100521 


3AORF143 


-1 


1942 . .2103 


53 


ataaagcttagaagttgactgatca 


atc


taa 


100522 


3AORF144 


-3 


39380 . .39541 


53 


ttccaccagtttcatctcttaagaa 


atc


taa 


100523 


3AORF145 


3 


20388 . .20546 


52 


tctgagtggtcagaattagctatta 


atg 


taa 


100524 


3AORF146 


-2 


2358..2516


52 


aacatgtccatattatgaacaatca 


att 


tga 


100525 


3AORF147


-3 


5606. .5764 


52 


gtgatttgtttgtggtagatattca 


att 


tga 


100526 


3AORF148 


2 


34145 . .34300 


51 


tttacttctccgttttatatgaagg 


att 


taa 


100527 


3AORF149


-1


7918 . . 8073 


51 


tattctcttgatttactaattctaa 


ata


taa 


100528 


3AORF150 


-2 


11745 . .11900 


51 


ttcatccttatgtctttgatcagca 


ata 


taa 


100529 


3AORF151


-3 


7097. .7252 


51 


tttaccttcatgatacccgtataca 


ata 


tga 


100530 


3AORF152 


1 


21652 . .21804 


50 


ctaaaaatattagagacatcaagat 


gtg


taa 


100531 


3AORF153 


2 


5381. . 5533 


50 


tcggctaagtctgaattactattaa 


gtg


taa 


100532 


3AORF154


-1


39670. .39822 


50 


ttgataaaatcgtcttctttcaaag 


ata 


taa 


100533 


3AORF155 


-1 


38233 . .38385 


50 


ataggctctacaaaatgcaccaaca


att 


tag 




3AORF156


. A 


33040..33192
Table 9

Bacteriophage 96, complete genome sequence 

1 catagttata ggcttttcag ctatatacca agataagatt tatcccgccg tctccataaa aatatgcttg 

71 gaaaccttga tttaatgggg ttttaatcta gcaagtgtca aatatgtgtc aagaaaataa ttttctgaca 

141 cgttgacctt gctctttttt atgttcatca agtaagtgag agtaggtgtc taaagttata gatatattat 

211 aatggcctaa tcttttgcta atatattcaa taggtatacc tttagaaagt aggaaagatg tatgcgtgtg 

281 tcttaatgaa taaggtgtta ttgtagtatc atttagtcct atttgactct tagcatggtt aaatgacttt 

351 ttaacggcat tatgactcaa tttaaacaac ttattatctg tacgttttgg taattttgat aatttagctt 

421 taatatgttg tatatccttt tttggtacct ccacaagtct gtccgcgtta actgtttttg ttccacgaag 

491 atgtattgta ccctcttttt cgtttagatc gataggcaac atattaatta catcgctgta tcttgcacca 

561 gtgatagcta ggatgaataa aaaaatataa ctcgattcgt ctctagattt aaagtattct atcaattgca 

631 agtattgttc tatggtgatg aatttagagt gttcgtcttt tgattttttt gtaccacgaa tatctatttg 

701 atagctaggg tctttcttta aatagccctc atatactgca tctctgaagc attgtgataa acaactgttt 

771 aatttacgaa ccgtttcatt agtacgacct cgaccgaatt cgttcaaaaa cttttgatac tccgaacgtt 

841 tgatgttttt tattaaaaaa tcactcccga aatattcgtt aaataatttt aatgaacgtt gataccaata 

911 gaattgttgt gaagcgacat gtttcttatt ttttgaatct aaccaatcat tgtaatattc ttcaaacttt 

981 ttattttcat ctaaattgtt tccatcatcc aaatctctaa gcagttgttg agcagcgttg gttgcctcag 

1051 ctttagtttt gaatcctgac tttcttttct ttcctgattt gaaagacgga tgttttacgt cgtactgcca 

1121 agatgctgtt gctttattct tcctttttgt aattgtaaat gacgccattt tacttttcct cctcaaaatt 

1191 ggcaaaaaat aataagggta ggcgagctac ccgaaatttt attgttgaac aactattgct tcacttcttg 

1261 cttttcctac ttcttttcta aaactatcat atgattgatt agggtgtgtt aacgacattc ctggaccacc 

1331 tccagcatgt tggtttttgt ccggattatt ttccatttct tcagtggctc ttttagcatt taaatattct 

1401 tcgtaactag gttcgtttgg gtcgcgtggt tgtgcttgtt gtccattatt ggtagctgga agattcttct 

1471 gtacctgttg cttagatgtg ttattggttt gttgattgtt gttaatgttt gtgttgttct cgttgtttac 

1541 ttgattattg ttatcgtttt gattactatt ttcttttttc gcttctgctt tatctttagt ttctttcttt 

1611 ttgtctttgt tctctttctt tgtttcggtt ttcttgcttt cctctttctt atcgccgtcg ttgctaccgc 

1681 atgcacctaa cactaacgca ctagctaata ataaaactaa taatcttttc atgttttaca ctcctttatt 

1751 tgctatttgt tttaataaat ctatgatttc attgttttgt tctatgattt tgttttcatt tttaagatgt 

1821 tcgtctaaca tctctattaa gacgaaattt tgatttatca tttcgtaagt aaacatttga cctgtgttgt 

1891 taggattaga aaacgaacta ctgaaacgcg ttgaaaagct atctataaat tgaccaactt tattttttaa 

1961 taacatatct ttaccgctct cagacattgt atttagttcg cgcttattta aagttttttc tataattttg 

2031 tattttgttt cctgatttct ttcgatttct tctacttcaa aagggatatt gttattaaat ttttcgataa 

2101 tatcacgttt ttcagaaact gacatacgat caaatacttg tttttgacct ttatttaact tccctcgaat 

2171 ttttccggca gtccaagact ctttaactgt taacttatca ttaggaactt gattcatctt ttatatgact 

2241 ccttttctca tatttcttta tatttaaaaa ctctcaacgg ctcaaatgta atcgaatact cgccatagtg 

2311 agttccaata ccgtatatct tcttatattg ttctattgcc tccaatatgt attcttcgct taattgtaga 

2381 tactcagaca actcatacaa gttacgtacg ccataattgt aagcttctac aatttcgcgt aacgggactg 

2451 ctgagataaa gccgtgtcgt cttgcgtaat tttcgaactt gcgattgttg aatttcgatt gatctaaaat 

2521 gttgccatac gtcaacttgt ggtgggcaag ttcttcatat aatacttcta atttgttcct ttcggataag 

2591 gaaggtctaa taaaaatttc tccttcttga taccaaccat cgaatcctcg aggtactctt tgtgtttctt 

2661 tcacttcaac ttcacatttc ataagcaatt cttcgtattt tcccatgcgc caaacccctt tggtgtctta 

2731 tttctttcta tctctaaccc attgcataaa attttcgatt tcttcccatt cttcgggagt aaattcatct 

2801 ttatttgcat gaccggctat agtttcttga tgaatacttc tttcttctgt aattctcgat ttaggtacat 

2871 taaagtaatc tgctaattgt tggacttttg atattctagg atatttaagt tctttaagcc agttagagat 

2941 tgttgattga cttaccccga ttgcttcaga caattctact tgagtaatgt tgttctcttt cataagttgt 

3011 tctaagttct ctgataaaat ttttctagca ctcttatatt ccataatttt ctcctttagt attacttaat 

3081 gtaatactaa tttaccataa gtaatatcac ttttcaatac aaaatattac ttttttgaaa taaatatcac 

3151 tttaggtgtt gacatattac tttaagtgat agtatagttg taaatgtcaa cgggaggtga tacgaaatgc 

3221 cagaaaattt taaagagttc tctgtaaagg tctggagaac taattcgaat atgacacaac aagatgtcgc 

3291 tgataaatta ggcgttacta aacaatctgt aataagatgg gaaaaagatg acgcagaatt aaaaggctta 

3361 caattgtatg ctttagccaa attattcaac acagaagttg attatataaa ggctaaaaaa atttaacatt 

3431 aatatcactt taagtgataa aggaggaaac tgaaatgcaa gaattacaaa catttaattt tgaagaatta 

3501 ccagtaagga aaattgaagt ggaaggagaa cccttctttt taggtaagga tgttgctgaa attttagggt 

3571 atgcacgagc agataacgcc atacgcaatc atgttgatag tgaagatagg ctgatgcacc aaattagtgc 

3641 gtcaggtcaa aacagaaata tgatcatcat caacgaatct ggattataca gtttaatctt tgacgcttct

3711 aaacaaagta aaaacgaaaa cattagagaa accgctagga aattcaaacg ctgggtaact tcggaagttt 

3781 taccgacgtt aagaaaaact ggtgcttacc aagtacctag tgacccaatg caagcattga gattaatgtt 

3851 tgaagctaca gaagaaacaa aacaagaaat taaaaacgtg aaagatgatg ttattgattt gaaagaaaat 

3921 caaaaactgg atgcgggaga ctacaatttc ttaactagaa caatcaatca aagagtagct catatacaaa 

3991 gactacatgc gataacaaac caaaaacaac gtagcgaatt attcagggat attaattcag aagtgaaaaa 

4061 gatgactggt gcgagttcaa gaacgaacgt aagacaaaaa catttcgacg atgtaattga aatgattgct 

4131 aattggttcc cgtcacaagc tactttatac agaatcaagc aaattgaaat gaaattttaa aacgaaatat 

4201 aggagaggct gaatatggaa tacatcggat atgcagacgc aaatgcgttt gtaaaaataa gtggcatttc 

4271 aaaagatgat ctagagaaaa aagtctactc gaacaaagag tttcaaaaag aatgcatgta cagatttggt

4341 cgaggacaaa agcgttatat aaaaattgac aaagctattc aatttatcgg taccaattta atgattaatg 

4411 aatacgaatt ataggaggag ttatcaaatg agtaaaactt ataaaagcta cctagtagca gtactatgct 

4481 tcacagtctt agcgattgta cttatgccgt ttctatactt cactacagcg tggtcaattg caggattcgc 

4551 aagtatcgca acattcatat actacaaaga atacttttat gaagaataaa aaaactgcta cttgcgtcaa 

4621 caagtaacag tgacaaacat ttatcaaaat atacaactta attaaatcaa aatatacgga ggcagtcaac 
4691 tatggctgaa aatattaaaa ctgaacaaca ttattacact aaagatttct caggatacag aaatgaagaa 
4761 gataactttg tagcaaatca agaattgaca gcaacaatca cattgaacga gtacagaaaa cttattgaaa

4831 taaaggctgt taaagataaa gaagaagata cctacagagg taagtatttt gcggaagaaa gaaaaaacga 

4901 aaaattggaa aaagaaaata taaaactaaa aaacaaaatt tatgaattac aaaacgaaga agataacgag 

4971 gaggacgaag aagacaagga ggacgagaac gatgtattac aaaattggtg agataaaaaa caaaattata

5041 agctttaacg ggtttgaatt taaagtgtct gtgatgaaga gacatgacgg tatcagtata caaatcaagg 

5111 atatgaataa tgttccactt aaatcgtttc atgtcataga tttaagcgaa ctatatattg cgacggatgc 

5181 aatgcgtgac gttataaacg aatggattga aaataacaca gatgaacagg acaaactaat taacttagtc 

5251 atgaaatggc aggaggtatg aaaagtgaat gatttacaag agagagaatt agaaacattc gaacaagacg

5321 accgattcaa agtaactgat ctagacagtg ctaactgggt ttttaagaaa ctggatgcaa tcacaactaa 

5391 agagaatgaa atcaacgatt tagcaaataa agaaattgaa cgcataaacg aatggaaaga taaagaagta 

5461 gaaaaattac agagtggcaa agaatattta caaagccttg taattgaata ttacagaata caaaaagaac 

5531 aagatagcaa attcaagttg aatacacctt acggaaaagt gacagccaga aaaggttcaa aagtcattca 

5601 agttagcaat gagcaagaag tcattaaaca acttgagcaa cgaggttttg acaactatgt aaaagtaact 

5671 aaaaaactta gccaatcaga cattaagaaa gatttcaatg taactgaaaa cggcacattg attgacgcaa 

5741 acggcgaagt tttagagggt gctagcattg tggagaaacc aacgtcatac acggtaaagg tgggagaata 

5811 gatgactgaa aaaactaatc aagatgtcga tattttaacg caactaggtg taaaagacat cagcaaacaa 

5881 aatgcaaaca agttttataa atttgcgata tacggcaagt tcggtactgg taaaactacg tttttaacaa 

5951 aagataacaa taccttagta ctagatataa atgaggacgg aacaacggta acagaagatg gggcagttgt 

6021 gcagattaag aattataagc attttagtgc agtgattaaa atgctgccta aaattattga acaactaaga 

6091 gaaaacggaa aacaaattga tgttgtagtg attgaaacaa tccaaaagtc acgtgatatc actatggacg 

6161 acatcatgga cggtaaatca aagaaaccga catttaatga ttggggcgag tgtgctacac gcattgtaag 

6231 tatttatcgt tatatttcta aattacaaga acattatcaa tttcatcttg ctataagcgg acacgagggc 

6301 attaacaaag acaaagatga tgagggaagt actatcaatc caacaatcac gatagaggca caagaccaaa 

6371 taaaaaaagc agtcatcagt caatctgacg tgttagcaag aatgacaata gaagaacatg agcaagacgg 

6441 cgaaaaaact tatcaatatg tacttaacgc tgaaccatca aatttattcg agacaaagat aagacactca 

6511 agcaacatca aaattaacaa caaacgtttc attaatccaa gtattaacga tgttgtacaa gcaattagaa

6581 atggtaatta aaaattaatt aaaaggacgg tataaaaatt atgaaaatca ctggtagaac acaatacatt 

6651 caagaaacta atcaagaggc attcatgaaa ggtggggact ttttaggagc tggagaattt acagtaaaag 

6721 ttgcaaatgt cgagtttaac gacagagaaa acagatactt cacgattgtt tttgaaaaca acgaaggtaa 

6791 acaatacaaa cacaaccaat tcgtcccacc attccaacaa gattatcaag aaaaacaata tatcgagtta 

6861 cttagtagat taggaattaa attgaactta ccagatttaa cttttgacac agatcaatta attaacaaaa 

6931 tcggaactat tgtacttaaa aataaattta acgaggaaca aggcaagtat tttgtaagac tctcatatgt 

7001 aaaagtttgg aataaagacg atgaagtagt taataaacca gaacctaaaa ctgatgagat gaaacaaaaa 

7071 gaacagcaag caaatggtaa acagacacct atgagtcaac aatcaaaccc attcgctaat gctaatggtc 

7141 caatagaaat caatgatgat gatttaccgt tctaggacgt ggtttaaatg caatacatta caagatacca 

7211 gaaagacaat gacggtactt attccgtcgt tgctactggt gttgaacttg aacaaagtca cattgattta 

7281 ctagaaaacg gatatccgct aaaagcagaa gtagaggttc cggacaataa aaaactatct atagaacaac 

7351 gcaaaaaaat attcgcaatg tgtagagata tagaacttca ctggggcgaa ccagtagaat caactagaaa 

7421 attattacaa acagaattgg aaattatgaa aggttatgaa gaaatcagtc tgcgtgactg ttcaatgaaa 

7491 gttgcgagag agttaataga actgattata tcgtttatgt ttcatcatca aatacctatg agtgtagaaa 

7561 cgagtaagtt gttaagcgaa gataaagcgt tattatattg ggctacaacc aaccgcaact gtgtaatatg

7631 cggaaagcct cacgcagacc tggcacatta tgaagcagtc ggcagaggta tgaacagaaa caagatgaat 

7701 cactacgaca aacatgtgtt agcactgtgt agacaacatc ataatgaaca gcacgcaatt ggtgttaagt 

7771 cgtttgatga taaatatcaa ttgcatgacc cgtggataaa agttgatgag aggctcaata aaatgttgaa

7841 aggagagaaa aatgaataag ttactaatag atgactatcc gatacaagta ttaccgaaat tagctgaatt 

7911 aatagggtta aacgaagcaa tagtattgca acaaattcat tattggctaa acaactcaaa acataaatac 

7981 gatggcaaaa cttggatttt taattcttat ccagaatggc aaaaacaatt tccattttgg agcgagagaa 

8051 ctataaaaag gacatttggg agtttagaaa aacaaaattt attgcatgta ggtaactaca acaaggctgg 

8121 atttgaccgt acaaaatggt attcaatcaa ttatgaaaca ttaaacaaac tagtggcacg accatcggga 

8191 caaaatggcc cgacgatgag gacaaatcgg cacgatgcaa gaggacaaaa tgacccgacc aataccatag 

8261 actacacaga gactaacaaa catagagaga cagacgacgt ctcaaagtca tttaagtata ttagtaccaa 

8331 tttagaaatt atacaaaacc ctttaaaagc agaacagtta gaacacgaaa ttaaatcatt taagcaagat 

8401 cagttcgaaa tagtaaaagt cgctaccgat tactgcaaag aaaacaacaa aggtctgaat tacttactaa

8471 ctgtattaaa gaactggaat aaagaaggcg tttcagataa agaaagtgct gaaaacaaat tgaaacctcg 

8541 taactctaaa aaagaaacta ctgatgatgt catagcacaa atggaaaaag aattgagtga tgactaatgc 

8611 cgatgagcaa aacacaagca ttagaaatta ttaaaaaagt taggtacgta tacaacatcg attttgataa 

8681 accaaagtta gaaatgtgga ttgatgtatc aagtcaaaac ggggattatc aaccaactgt aaaagctgta 

8751 gatggatata tcaacagtaa caacccgcac ccgcctaacc taccagcaat catgcgtaag gcacctaaaa 

8821 aagtatctat tgagccggta gacaacgaaa ccgctacaca ccaatggaaa atgcagaatg accccgaata 

8891 tgtcagacaa agaaaaatag cgctagataa cttcatgaat aagttggcag aatttggggg cgataacgaa 

8961 tgaattacgg tcaatttgaa attgaaagca caataatcgc tacgctactt aaacaaccgg acgtactaga 

9031 aaagataaga gttaaagatt acatgtttac gaacgaaaag tttaaaacct ttttcaatta tgtaatggac 

9101 gtcggaaaga tagatcatca agaaatctat ttaaaagcaa ctaaagataa agagttttta gatgcagata 

9171 ctataactaa actttacaac tccgatttca ttggatacgg attctttgaa cgttatcaac aagaattatt 

9241 ggaaagttat caaatcaaca aagcgaaaga attggtaact gagttcaaac aacaacctac gaaccaaaat 

9311 tttaataact tgattgatga actcaaggat ttaaaaacaa ttactaacag aaaagaagac ggaaccaaga 

9381 agtttgttga ggagtttgtc gatgagttat acagcgatag ccctaagaag caaattaaga cgggttataa 

9451 gctcatggat tacaaaatag ggggattgga gccgtcgcaa ttaatcgtca tcgcagcgcg tccctcagtg 

9521 ggtaagacag gttttgcatt aaacatgatg ctgaacatag cacaaaatgg atacaaaaca tctttcttta

9591 gtctcgaaac aactggcaca tcagtattga aacgtatgtt atcaacaatt actggtattg agttaacaaa

9661 gataaaagaa atcaggaact taacgccgga tgacttaaca aagttaacga atgcgatgga taaaatcatg 

9731 aaattaggca tcgatatttc tgataaaagt aatatcacac cgcaagatgt gcgagcgcaa gcaatgaggc 

9801 actcagacag gcaacaagtt atttttatag attatcttca actgatggat actgatgcga aagttgatag 

9871 acgtgtagca gtagaaaaga tatcacgtga cttaaagata atcgctaacg agacaggcgc aatcatcgta 

9941 ctactttcac aactgaatcg tggtgtcgag tctagacagg ataaaagacc aatgctatcg gacatgaaag 

10011 aatcaggcgg aatagaagca gatgcgagtt tagcgacgct actttaccgt gatgattatt ataaccgtga 
10081 cgaagatgac agtatcactg gcaaatctat cgttgaatgt aacatagcca aaaacaaaga cggcgaaacc

10151 ggaataattg aatttgagta ttacaagaag actcagaggt ttttcacatg aatataatgc aattcaaaag 

10221 cttattgaaa ccgatgtatg aagagacaaa gcaaagcgac ccgattgtag caaatgtata tatcgagact 

10291 ggttgggcgg tcaatagatt gttggacaat aacgagttat cgcctttcga tgattacgac agagttgaaa 

10361 agaaaatcat gaatgaaacc aactggaaga aaacacacat taaggagtgt taaaaaatgc cgaaagaaaa 

10431 atattactta taccgagaag atggcacgga agatattaag gtcatcaagt ataaagacaa cgtaaatgaa 

10501 gtttattcgc tcacaggagc ccacttcagc gacgaaaaga aaattatgac tgatagtgac ctaaaacgat 

10571 ttaaaggcgc ccacgggctt ctatatgagc aagagctagg attgcaagca acgatatttg atatttagag 

10641 gtggcacaat gagtaaatac aatgctaaga aagttgagta caaaggaatt gtatttgata gcaaagtaga 

10711 gtgcgaatat taccaatatt tagaaagtaa tatgaatggc actaactatg atcgtatcga aatacaaccg 

10781 aaatttgaat tacaacctaa attcgggaaa caaagaccga ttacgtatat agccgatttc tctttgtgga 

10851 aggaagggaa actggttgaa gttatagacg ttaaaggtaa ggcgactgaa gttgccaaca tcaaagcgaa 

10921 gatattcaga tatcagtata gagatgtgaa tttaacgtgg atatgtaaag cgcctaaata cacaggtcaa 

10991 gaatggatgg tatatgagga cctagtgaaa gtcagacgta aaagaaaaag agaaatgaag tgatctaatg 

11061 caacaacaag catatataaa cgcaacaatt gatataagaa tacctacaga agttgaatat cagcattacg 

11131 atgatgtgga taaagaaaaa gatacgctgg caaagcgctt agatgacaat ccggacgaat tactaaagta 

11201 tgacaacata acaataagac atgcatatat agaggtggaa taaatgaagt tgaacgaagt attcgcaact 

11271 aatttaaggg taatcatggc tagagataac gtaagtgtcc aagatttgca caatgaaact ggcgtatcaa 

11341 gatcaactat tagtggatat aaaaacggaa aagctgagat ggttaaccta aatgtattag ataaattggc 

11411 agatgctcta ggtgttaatg taagtgaact atttactaga aatcacaaca cgcacaaatt agaggattgg 

11481 attaaaaaag taaatgtata gaggtggaat aaatgagtat cgtaaagatt aacggtaaac catataaatt 

11551 taccgaacat gaaaatgaat tgataaaaaa gaacggttta actccaggaa tggttgcaaa aagagtacga 

11621 ggtggctggg cgttgttaga agccttacat gcaccttatg gtatgcgctt agctgagtat aaagaaattg 

11691 tgttatccaa aatcatggag cgagagagca aagagcgtga aatggttagg caacgacgta aagaggctga 

11761 actacgtaag aagaagccac atttgtttaa tgtgcctcaa aaacattctc gtgatccgta ctggttcgat 

11831 gtcacttata accaaatgtt caagaaatgg agtgaagcat aatgagcata atcagtaaca gaaaagtaga 

11901 tatgaacaaa acgcaagaca atgttaaaca accggcgcat tacacatacg gcaacattga aattatagat 

11971 tttatcgaac aggttacggc acagtatcca ccccaactag cattcgcaat aggtaatgca atcaaatact 

12041 tgtctagagc accgttaaag aatggtcatg aggatttagc aaaggcgaag ttttacgtcc aaagagcttt 

12111 tgacttgtgg gagggttaac gatggcaacg caaaaacaag ttgattacgt aatgtcatta caggaacaat 

12181 tgggattaga agactgtgaa aaatatacag acgaacaagt caaagctatg agtcataaag aagttagcaa

12251 tgtgattgaa aactataaga caagcatatg ggatgaagag ctatataacg aatgcatgtc gtttggtctg 

12321 cctaattgtt aaaaggagtg atgaccatga acgatagcgc acgcaaagaa tacttaaacc aatttttcag 

12391 ctctaagaga tatctgtatc aagacaacga gcgagtggca catatccatg tagtaaatgg cacttattac 

12461 tttcacggac attataaaac gatgtttaaa ggcgtgaaaa agacatttga tactgctgaa gagctcgaaa 

12531 catatataaa gcaacatgat ttggaatatg aggaacagaa gcaaccaact ttattttaga ggagatggaa

12601 ataatggcaa agattaaaag aaaaaagaag atgacgctac tcgaactggt ggaatgggca tggaacaatc 

12671 ctgaacaagt tgaaagtaaa gtgtttcaat cagatagaat gggcacgctt ggagaatgta gcgaagtaca 

12741 tttttcaact gatgggcatg ggttttatac aaaagtagta acagataaag atatttttac tgtagaaatc 

12811 acagaggaag tcactgaaga tactgagttt gattgtctag tagaactaaa cgatattgaa ggttttgaaa 

12881 tatatgaaaa tgattcaatc agagagttga tagacggtac ttccagagcg ttttatatac taaacgaaga 

12951 taaaactatg acattaattt ggaaagatgg ggagttggta gtatgatgca aacctataaa gtatgtcttt 

13021 gtatcaagtt ctttgcatct aaatgtgatt ataaattaaa gaaacattat ttcgtgaaaa gtacgaatga 

13091 ggaaaaagcc acgaacatgg tattaaaact gattcgtaaa aagctcccgt tcgaaactgc aagcatagaa 

13161 gtcgaaaaag tggaggcaat ataatgatac aaccaacaag agaagaatta attaatttca tgaaaaaaca 

13231 tggagctgaa aatgttgact ctatcactga tgagcaaagt gcaataagac actttagagc tcaatcaaaa 

13301 gtttttaaag acgaacgtga tgagtacaag aagcaacgag atgagcttat cgaggatata gctaagttaa 

13371 gaaaacgtaa cgaagagctg gagaacatgt ggcgcacagt caaaaatgaa ttgcttggaa gatacgaaca 

13441 ttactgtttt aaaattagag aactacaccc tgagagcaaa gcgaacagga taggagctct ctatatagga 

13511 ggtaaaagca ctgcagatat tatactgtcg cgaatggaag aactagacgg aacaaatgag ttctacgaat 

13581 ttttagggca aatggaggca gacacaaatg aataaccgtg aacaaataga acaatcagtg atcagtacta 

13651 gtgcgtataa cggtaatgac acagaggggt tactaaaaga gattgaggac gtgtataaga aagcgcaagc 

13721 gtttgatgaa atacttgagg gaatgacaaa tgctattcaa cattcagtta aagaaggtat tgaacctgat 

13791 gaagcagtag gggttatggc aggtcaagtt gtctataaat atgaggagga gcaggaaaat gagtattagt 

13861 gtaggagata aagtatataa ccatgaaaca aacgaaagtc tagagattgt gcaattggtc ggagatatta 

13931 gagatacaca ttataaactg tctgatgatt cagttattag cattatagat tttattacta aaccaattta

14001 tctaattaag ggggacgagt gagtggaatg gaaacgatta aaaaatgtgg tgccgcaccc agttatcaaa 

14071 aataaaaatt taaagtcggt atacgtaaca aaagataatg tgaaagaggt tcaaaaagaa ttaggtttct 

14141 ttgaaatttt taatgaagaa gtgttattaa ctggattttt atcatttcaa aggataccta tttacattat 

14211 ttggattaat cctaaatctc ataagacgcc tagatattac tttgctaacg agcatgagat tgaaagatat 

14281 tttgaatttt tggaggacga gtaaatgctt gaaatcatcg accaacgtga tgcattgcta gaagaaaagt 

14351 atttaaacga cgactggtgg tacgagctag attattggtt gaataaacgc aagtcagaaa atgaacagat 

14421 tgatattgat agagtgctta aatttattga ggaattaaaa cgataggaga taacgaataa atgaataatt 

14491 taacagtaga tcaattaaaa gaacttttac aaatacaaaa ggagttcgac gatagaatac cgactagaaa 

14561 tttaaatgac acagtagcta gtatgattat tgaatttgcg gagtgggtta acacacttga gttttttaaa 

14631 aattggaaga aacaaccagg taagccatta gatacacaat tagatgagat tgctgattac ttagctttca 

14701 gtttgcaatt aactctgact attgttgatg aagaagattt ggaagagact actgaggtta tggttgattt 

14771 gattgaaaat gaagttactt tacctaaact acattcagtt tattttgttc atgtaatgca tacactaaca 

14841 gaacaatttg taaaaggtat tgataatagt attgtacaag ttttaataat gccttttttg tacgccaata

14911 cttactatac aatcgaccaa ctcattgacg catacaaaaa gaaaatgaaa aggaaccacg aaagacaaga

14981 tggaacagca gacgcaggaa aaggatacgt gtaaagacat cttagatcga gtcaaggagg ttttggggaa 

15051 gtgacgcaat acttagtcac aacattcaaa gattcaacag gacaaccaca tgaacatttt actgctgcta 

15121 gagataatca gacgtttaca gttgttgagg cggagagtaa agaaggagcg aaagagaagt acgagaaaca 

15191 agttaagata aggagagatg gagatgccaa agaaaacggt aacgattgat gtagatgaaa acttattagt 

15261 agtagctagt aatgaaatat cagaactatt atatgaatat gacagtgagt taatgtcagc tgatgaagat 

15331 ggcgataata gagatatcga aaaaaaaaga gacgcattaa aacaagctat acaaattatc gataaattaa 
15401 catgtcgagg aggcagacga tgattaacat acctaaaatg aaattcccga aaaagtacac tgaaataatc

15471 aagaaatata aaaataaaac acctgaagaa aaagctaaga ttgaagatga tttcattaaa gaaactaatg

15541 ataaagacag tgaattttac agtcctatga tggctaatat gaatgaacat gaattaaggg ctatgttaag 

15611 aatgatgcct agtttaattg atactggaga tggcaatgat gattaaaaaa cttaaaaata tggattggtt 

15681 cgatatcttt attgctggaa tactgcgatt attcggcgta atcgcactga tgcttgtcgt catatcgcct 

15751 atctatacag tggctagtta ccaaaacaaa gaagtatatc aagggacaat tacagataaa tataacaaga 

15821 gacaagataa agaagacaag ttctatattg tgttagacaa caagcaagtc atcgaaaact ctgacttact 

15891 attcaaaaag aaatttgata gcgcagacat acaagctagg ttaaaagtag gcgacaaagt agaagttaaa 

15961 acgattggtt atagaataca ctttttaaat ttatatccgg tcttatacga agtaaagaag gtagataaat 

16031 aatgattaaa caaatattaa gactattatt cttactagcg atgtatgagc taggtaagta tgtaactgag 

16101 aaagtatata ttatgacgac ggctaatgac gatgtagagg cgccgagtga ctccgcaaag ttgagcgatc 

16171 agtctgattt gatgagggcg gaggtgtcag agtagatgta tagcaaagag tcaattgtta atatgatagg 

16241 cacacataaa atgaagtgta atgtattagc tgatgtaata ccggaatatg atagcaattc aattgcacag 

16311 tatggcatac aagcaacgtt gccgaaacca caaggggaaa actcaagtaa agttgaagat gttgttgtga 

16381 ggcttgagag agcaaataaa aggtatgctc agatgttaaa agaggttgag tttataaatc aatcgcaaca 

16451 gagattggga cacgttgact tttgcttctt agagttattg aagaaaggtt ataacaggga tgcgattatc 

16521 aagaagatgc ctaactctaa attaaataga aacaacttct tagcgcgccg tgatgagtta gcagaaaaga 

16591 tttatctact acagtgacga aaatgacaaa aatgacagaa atgacgaaaa tgacactatt tttaaactgt 

16661 gaattaattt tatataattg atttgtaaga attatcttaa gacgtggggt aatagccaca ttagatgttc 

16731 tcatcgatgt gattgagaag tgacaaacat ataaaagatg atatgttacg ctattaatca cctactacct 

16801 gcctatatgg tgggtagttt aattcttgca ttttgagtca taactatttt cctcctttca catttattga 

16871 acgtagctcc tgcacaagat gtaggggcat tttttatatt taaataacta gagtaattaa cgtaaaggcg 

16941 tgtgatacag tgaaaacaat tgattaaatt aacaccgaag caagaaaagt ttgtgctagg actcatagag 

17011 ggcaagagcc aacggaaagc atatattgac gcagggtatt cgactaaagg taagagtggg gaatatctag 

17081 ataaagaagc gagtacactt tttaaaaatc ggaaggtttc cggaaggtac gaaaaattgc gtcaagaagt 

17151 agctgaacaa tcaaaatgga cacgccaaaa ggcctttgaa gaatatgagt ggctaaagaa tgtagctaag 

17221 aatgacattg aaatagaggg agtgaagaaa gcgacagctg atgcattcct cgctagttta gatggtatga 

17291 atagaatgac gttaggtaac gaagttttag ctaaaaagaa aatagaaact gaaattaaga tgcttgagaa 

17361 gaagattgaa caaatagata aaggtgacag tggaacagaa gataaaatca aacaacttca cgacgcaata 

17431 acggaagtga tcgtcaatga ataaacttaa atctttatat acggacaaac aaattgaaat attgaagcaa 

17501 acgcaaaaac aagattggtt tatgttaatt aatcacggag caaagcgtac aggtaaaaca atattaaaca 

17571 atgacttatt tttacgtgag ttaatgcgtg tgcgaaagat agcagacgaa gaaggaattg agacacctca 

17641 atatatactt gctggtgcaa cattaggtac gattcaaaaa aacgtactaa tagagttaac taacaaatat 

17711 ggcattgagt ttaatcttga taaatataat tcattcatgt tatttggcgt tcaagtggtt cagacaggtc 

17781 acagtaaagt aagtggtata ggagctatac gtggtatgac atcgtttggt gcatatatca atgaagcgtc 

17851 gttagcgcat gaagaggtgt ttgacgagat taagtcacgt tgtagtggaa ctggtgcaag aatattggta 

17921 gataccaacc ctgaccatcc cgagcattgg ttgttgaaag attatattga aaatacagat cctaaagcag 

17991 gtatactgag tcaccaattt aagctcgatg acaataactt tcttaatgat agatataaag agtctattaa 

18061 ggcttcaaca ccaccaggta tgttctatga acgtaatatc aacggtatgt gggtgtctgg tgacggtgta 

18131 gtatatgccg actttgattt gaatgagaat acgattaaag cagatgaact ggacgacata cctatcaaag 

18201 aatactttgc tggtgtcgac tggggttacg agcactatgg atctattgtg ctaataggac gaggtataga 

18271 tggtaacttt tattttattg aggagcacgc acaccaattt aagtttattg atgattgggt ggttattgca 

18341 aaagatattg taagtagata tggcaatatt aatttttact gcgatactgc acgacctgaa tacatcactg 

18411 aatttagaag acatagatta cgtgcaatta acgctgataa aagtaaacta tcgggtgtgg aggaagttgc 

18481 taagttgttc aaacaaaaca agttacttgt tctttatgat aatatggata ggtttaagca agaggtattt 

18551 aaatatgttt ggcaccctac aaacggagag cctataaaag aatttgatga cgtgttggac tcgttaagat 

18621 atgccatata cacacatact aaacctgaac gattaaggag ggggaaatga cattgtataa gttaatagat 

18691 gatattgaag cacaaggaat attgcctaag catattgagg ctctaataga gtcacataaa gacgatagag 

18761 agagaatggt taatctctat aatagataca agacacatat tgactatgta ccaatattca aacgtcgacc 

18831 aattgaagaa aaagaagatt ttgaaactgg tggaaatgta aggcgattag acgtgtctgt taataacaaa 

18901 cttaacaact cttttgacag cgaaattgtt gatacacgtg ttggttattt acatggtgtt cctgttactt 

18971 atgatttaga tgaaaacgca gaaaaaaacg aaaagttgaa aaagtttata accaactttg ccattagaaa 

19041 tagtgttgat gatgaggatt ctgaaatagg taaaatggca gcaatttgcg gatatggtgc taggttagca 

19111 tatattgata cgaatggtga tattaggatt aagaatatag atccctataa tgttattttt gttggcgaca 

19181 atattttaga acctacatac tcattgcgct acttttatga aaaagatgat gataatggca ctgattatgt 

19251 gtacgcagag ttttacgata atgcttatta ttatgtattt cgaggagaag gtattgacgc tttgcaagaa 

19321 gttggacgat atgaacattt atttgattac aatccattgt ttggtgtacc taacaacaaa gagatgatag 

19391 gagatgctga aaaggttatt cacttaattg acgcatatga tttaacaatg agcgatgcat caagtgagat 

19461 tagtcagaca cgtttagcat accttgtgtt acgcggtatg ggtatgagtg aagaaatgat tcaagaaaca 

19531 caaaagagtg gcgcatttga gttgttcgac aaagatatgg acgttaaata cttaacaaaa gatgtaaatg 

19601 acacaatgat tgagaaccat ttagatcgaa tcgaaaagaa tatcatgcgt tttgcaaagt cagtaaactt 

19671 taattctgac gagtttaacg gaaatgtacc tatcattgga atgaaactta aacttatggc tttagagaac 

19741 aagtgtatga cgtttgagcg taagatgaca gctatgttga ggtatcaatt caaagttatt ttatctgcat 

19811 taaagcgtaa agggtacaac ttggatgatg atagttattt aaacctgata tttaagttca ctcgtaacat 

19881 tccagttaat aagttagaag aatcacaagt gctaattaac ctgaagggac aagtttcaga acgaacaagg 

19951 ttaggacaat cacaactagt tgatgatgtt gattacgaat tagacgaaat ggaaaaagaa agtcttgaat 

20021 ttaatgacaa attacctgac atagatgaag gtgacgcaaa tgacaaatcc caaaataacc aatcagaatg 

20091 atattgatga gtatatcgag ggtttaatct ctaaagcaga aaaaccaata gaacaactat ttgctaatcg 

20161 acttaaagag ataaaacaaa tcatcgcaga tatgtttgag aaatatcaaa atgatgatgt gtatgttaca

20231 tggactgaat tcaataaata caacaggctc aataaggagt taactcgtat aggtacaatg ttgacttatg

20301 actataggca agtagctaag atgattcaga agtcacaaga agatgcttat atagaaaaat tccttatgag 

20371 cctttattta tatgaaatgg cgagtcaaac atctatgcag tttgatgttc cgagtaaaga ggtaatcaaa 

20441 tcagctattg aacaacctat tgagttcatt cgtttaatgc caacactaca aaaacatcgt gatgaagtat 

20511 tgaaaaagac acgtatgcac attacacaag gtattatgag tggagagggt tactctaaga tagctaaagc 

20581 aatacgtgat gatgtcggca tgtctaaagc tcaatcattg cgtgtggctc gtacagaagc aggcagagca 

20651 atgtcacaag ctggacttga tagcgcaatg gttgctaaag ataacggttt gaatatgaag aaacgttggc 
20721 atgctactaa agatacacga acacgtgata ctcatcgtca tttagatggg gaatcagtgg aaatagatca

20791 gaattttaaa tcaagtgggt gtgttgggca ggcgcccaag ctatttattg gtgtaaacag tgcgaaagag

20861 aatattaatt gtcgttgcaa attactttat tatattgatg aaaatgaatt gccaactgta atgagagcac

20931 gtaaagacga tggtaaaaat gaagttatcc cattcatgac ttatcgtgag tgggagaaat ataagcgaaa

21001 aggtggtaac tgatatggat tttaaaataa aagtaaatgt tgatactggc gaagctatag aaaagttaga 

21071 acgcattaaa tccttgtacg aagagataat agagttacaa aacgaaaaag ttgttgtaaa cgtaacagtt

21141 aaaaatgaag ctgatttaga tatggttaaa acatctatta gcgaagaaaa tgctaaaaat aatgatttca 

21211 cactttttta gttgtctctt tgctactcga ccttagcatg tcgttaaact gctttttatt atgcactttt

21281 cggactgtta gggtacgcga agggcaaaaa ggagttttga tatatgaata tcgaagaagt taagtctttt

21351 tttgaagaac acaaagacga taaagaagta aaagattatc taaagggact taagacggtg tctgttgatg 

21421 acgttaaagg ctttttagat acagaagaag gtaaacgatt cattcaacct gaattagatc gttatcattc

21491 gaaaggatta gaatcatgga aagagaaaaa tcttgaggat ctaatcgaac aagaagtacg gaagcgtaat 

21561 cccgagcaat cagaagaaca aaaacgtatt agtgctcttg aacaagagtc agaaaaacgc gacgcagagg

21631 caaaacgtga gaagttaaga agtaacgcgc taggtaaagc gcaggaacta aatttaccaa catccttagt

21701 tgatagattt ttaggcgatt ctgatgaaga tactgagcaa aacttaaaag ctttaaaaga aacctttgac

21771 aagtatgttc aaaaaggcgt tgagtctaaa tttaaatcga gtggaagaga tgttaaagaa tcacgaaatc 

21841 aagatttaga cccttcaaat gtaaagtcca ttgaagaaat ggcgaaagaa atcaatatta gaaaataaag 

21911 tgaggtaata aaatatggca actccaacat acacgccagg caatgttatt ttatcggatt ttaaaaacgg

21981 cgttattcca gcagaacaag gtactttaat catgaaagac attatggcta attcagcaat tatgaaatta 

22051 gctaaaaatg agccaatgac agcacaaaag aaaaaattta cttacttagc aaaaggtgta ggcgcctact

22121 gggtatcaga aacggaacgt attcaaactt ctaagcctga atatgcgcaa gcagaaatgg aagctaagaa

22191 aattggtgta attattccgt tatcaaaaga gtttcttaaa tggactgcaa aagatttctt taatgaggtt 

22261 aaacctctaa ttgcagaggc attttacaaa gcgtttgacc aagctgttat ctttggtact aaatcacctt

22331 acaacacttc aactagtggt aaaccgcttg ttgaaggcgc agaagagaaa ggtaacgttg ttacagatac

22401 taataattta tacgtagacc tttcggcatt aatggctact attgaagatg aagagttaga tccaaacgga

22471 gtattaacta cacgttcatt cagaagtaaa atgcgtaatg ctttagatgc taatgacaga ccattatttg

22541 atgctaacgg gaacgagatt atgggattac cactatctta tactggagcg gatgtatacg acaaaaagaa

22611 atcgttagca ctaatgggtg attgggatta cgcacgttac ggtatcttac aaggtattga gtatgcaatt

22681 tctgaagatg ccacgttaac gacgttacaa gcatcagatg cttctggcca accagtatca ttatttgaac 

22751 gtgatatgtt cgctttacgt gcgacgatgc atattgcata catgaacgtt aaaccagaag cgttcgcaac

22821 gcttaaacca actgaatagg aggagatatg atggctaatc ctgcagaaga gattaaggta aaaaaagaca

22891 atatgactat tactgttaca aagaaggcat ttgactctta ttacagtctt gtcggttaca aagaggttaa 

22961 atcacgtcgt actacgtctg ataagagcga gtgataaaaa tgactcttta tgaagatgtt aaacttttac

23031 tcaagaaaaa tggagtggaa gttaaaagtg atgaagaaga aatatttaag atggaagttg acggaatact

23101 agaagatgtt agggatataa caaacaatga ttttatgaaa gatggtcaag tcatttatcc ttactcaatc 

23171 aaaaagtatg tcgcagatgt cctagagtat tatcaacgac ctgaagttaa aaagaattta aagtcaagaa

23241 gtatggggac agtgtcgtac acttataacg atggtgtccc tgattacatt agtggagtat taaacaggta 

23311 taaacgagca aagtttcatc cgtttaaacc aataaggtag aggtgttgtt tgtgtttaac ccatacgacg 

23381 aattccctca cactatttct attggaagta tcaaaaaagt aggagagtat ccaattatac aagagcgctt

23451 tgtaagcgat aaaacaatta aaggatttat ggatacgcct actacatctg aacaactaaa atttcatcaa

23521 atgtcacaag aatatgacag aaacctatat gtaccttatg acttgccaat atctaaaaac aatttatttg

23591 agtatgaggg tagaatcttt agtattgaag gtgattctgt agatcagggc ggacaacatg aaattaagtt 

23661 actacgactt aagcaggtgc catatggcaa aagttaagta cggtgctgat agcatggttg ttgaattgga

23731 taagttcgat aagaaaatag aagagtgggt taaaaaaggt attgctaaaa caacgacgaa gatttacaac

23801 actgctgtag cattagctcc tgttgactta ggttttttag aagaaagtat tgactttaaa tatttcgatg

23871 gtgggttatc cagtgttata agtgtcggcg cagattatgc aatatacgtt gaatacggta ctggtatata 

23941 tgctactggt cctggtggta gtcgtgctac aaagattccg tggagtttta aaggtgatga cggcgaatgg

24011 tacaccacat atggtcaagc gccacagcca ttttggaacc ctgcaattga cgcaggacgc aagacattcg 

24081 agcagtattt ttcatagagg tggttaaata tgtgggtatc agttgagcct gaacttacaa atcaaatata

24151 taaaagatta atctcagacc ctaacattaa caaactagtt gatgataggg tttttgacgt tgttcaagat 

24221 gacgctgttt acccatatat tgttgtgggt gaatcaaacg tcactaacaa cgaatctagc gcaacaatga 

24291 gagaaacagt cggtattgtc atacatgtgt attcacagtt cgctacacaa tacgaggcta agctcatttt 

24361 aagcgcgata ggttatgtgc ttaacagacc tatagaaata gataactacg agtttcaatt tagccgtatc

24431 gatagtcaag cagtattccc tgatatagac aggtttacta agcatggcac gatacggctt ttatttaagt

24501 acagacataa aaagaaaaac gaaggagtgt attaaatggc gcaaaaaaac tatttagcag ttgtacgtcc 

24571 agctgaaact gacttagatc cagtagaatc tttattatta gctgacttac aagaaggtgg acatacgatt

24641 gaaaatgatt tagctgaaat agtacgaggc ggtaaaacgg actattctcc caatgcaatg tcagaatcat

24711 ttaaattaac aattggtaat gtgcctggag ataaaggaat tgaagcagtg aaacacgctg tacaaacagg 

24781 tggacagttg cgtatatggc tttatgagcg caataaacgt gcagacggta aacatcacgg aatgtttggt

24851 tatgttgttc cagaatcatt tgaaatgtca tttgatgatg aaagtgacaa aatcgaacta tcattaaaag 

24921 ttaaatggaa tacagcagaa ggtgctgaag ataacttgcc gaaagagtgg tttgaagctg caggtgcgcc

24991 tacagttgaa tacgaaaaat tcggcgaaaa agtcggaaca ttcgagaatc aaaagaaagc tagtgttgta

25061 tctgattcac acacggaaga ccattctatg taaactaata gatcaagggg gcgtaagctc cctatttttt

25131 tataaaaaaa ttgaaaagag gtatatattt tgactgaatt taatccaatt acaacattaa aaattaatga 

25201 cggagaaaaa gattacgaag tagaagcaaa agtaacattt gcatttgacc gaaaagccga aaaattctca

25271 gaagatagcg aagatgggag aaaaggagca acgccaggat tcaatgttat ctttaacggt ttgctagaat

25341 ctagaaacaa agcgatttta caattttggg aatgtgctac tgcctattta aaaaacccac caactcgaga

25411 acaattagaa aaagcaattg atgatttcat cactgaaaac gaggatactt tgccgttatt acaaggggct

25481 ttggacaaac ttaacaatag tggtttttcc aagagggaga gtcgctcgta ctggatgaca ttgaacaaag 

25551 caccgaatat ggccaaaagc gaggacaaag aaatgacgaa agcaggcata gaaatgatga aagagaatta

25621 caaggaaatc atgggcgcag aaccttacac gattactcaa aaataaggca actgacagct agatatttag 

25691 gatatatccc tgaacacgaa ctgctagcac taacacctgc tgaatggcgt gattggctta ttggtggtca 

25761 ggataggtac ctagatcaaa gacaattatt aattgaacaa gcgcaagcta acggcttagt acaagcttct

25831 aagaggctaa ctagtatgat tcgtgacatt gagaaacaac gttacgaaat aagagaacct ggtagctatg

25901 ctcgtgtaca aaaagctaga ttagaagaag aaaaaagaag acgtgaactc ttcaaagaag gtacaagaaa

25971 attccttgaa tcgaaaggag gtcagccttt ggatacccat tctatggcaa agattatggc caatattaga 
26041 gatttccaaa gcaacgtaag gaaagctcaa cgattagcaa agacgtctgt accaaacgaa attgaaacag

26111 atgtaaaagc agatatttca agattccaaa gagctttaca acgcgctaaa tcaacggctc aacgatggcg

26181 agagcattct gttaaattat tcatgaaaac agatgagtat aaagcgaatt tagaacgcgc taaagctcaa 

26251 gtagagcgat ttaaacaaca taaagtagat ttgaaactaa gtaacactga atcaatggcc aaatataatg 

26321 caactaaagc tactgtcgaa gcttggagaa aacatgttgt taagttggat ttagatgcaa accccgctaa 

26391 aatggcggtt aaagggttta aagaagattt aatagatctt agcaggcata gttttgatat tgattccagc 

26461 agatggaaat caggaaataa attcacaaaa gaattcaatg aagtcgaagg agcagttaaa cgttctttcg 

26531 gaagaattgg tcagattatg agaaaagaag taaatggaac aagtgatatt tggggtaaac ttaacaactc 

26601 attgaaagat tacggcgaga aaatggacgc cttagctact aaaatccgaa ctttcggtac tatcttcgcg 

26671 caacaggtca aaggcttaat gattgctagt atacaagcat tgataccagt gattgccgga ttagtacctg 

26741 caataatggc agtacttaat gcggttggtg tattaggtgg tggcgtttta ggtttagttg gcgcattctc 

26811 tgtcgcaggt cttggagttg ttggctttgg tgcaatggct attagcgctc ttaaaatggt tgaagatgga 

26881 acattggcag taacaaaaga agttcaaaac tttagagatg cgagcgatca gttaaaaact acatggcgtg 

26951 atattgttaa agagaatcaa gcaagtatct ttaatgcgat gtcagcaggt atcagaggcg ttacaagtgc 

27021 gatgtctcaa ttaaaaccat tcttatccga agtatctatg ctagttgaag caaacgcacg cgagtttgag 

27091 aattgggtta aacattccga aacagctaag aaagcgtttg aagcattgaa tagcataggt ggcgcaatct 

27161 tcggagattt attgaacgct gcaggacgat ttggcgacgg attagttaac attttcactc aattaatgcc 

27231 gttgttcaaa tttgtgtctc aaggactaca gaacatgtct atagctttcc aaaattgggc taatagtgta 

27301 gctggtcaga atgctattaa agcgtttatt gactacacta ccactaactt acctaagatt ggtcagatat 

27371 ttggtaatgt gttcgctggt attggtaatt taatgattgc ttttgcacaa aacagttcca acatttttga 

27441 ttggttggtt aaattaactt ctcaatttag agcatggtca gaacaagtag gacaatcaca agggtttaaa 

27511 gactttatca gttatgttca agagaatggt cctactatta tgcagttaat cggtaatatc gtaaaagcat 

27581 tagttgcttt tggtactgca atggctccta tagctagtaa attgttagac tttatcacta atctagctgg 

27651 atttatcgcc aaactattcg aaacacaccc agctatagca caagttgctg gcgttatggg tattttaggc 

27721 ggtgtatttt gggctttaat ggctccgatt gttgctataa gtagtgtact tacaaatgtg tttggtttga 

27791 gcctattcag cgtcactgaa aagattttag acttcgttag aacatcaagt ttagttactg gagctacgga 

27861 agcattaata ggtgcattcg gttcgatttc agcacctatt ttagcagttg ttgcagtaat tggtgcattc 

27931 attggtgtcc tcgtttattt atggaaaaca aacgagaact ttagaaatac tattactgaa gcgtggaacg 

28001 gtgttaaaac ggcagtttct ggtgcgattc aaggtgtagt cggctggtta actgaattgt ggggcaaaat 

28071 ccaacctacc ttacaaccga taatgcctat attgcaagta ttaggacaaa tattcatgca agttttaggt

28141 gttttggtaa taggcatcat tacaaacgtt atgaatatca tacaaggttt gtggacttta attacaattg 

28211 cgttccaagc cataggaaca gtgatatccg tagcagtcca aatcatagta ggtttgttca ctgctttaat 

28281 tcagttgctt actggcgact tctcaggtgc ttgggagact attaaaacta cggttaccaa tgtgcttgat 

28351 acgatttggc aatacatgca atcagtttgg gagtcaatta tcggcttttt aactggcgta atgaatcgaa 

28421 cactttctat gtttggtaca agttggtcac agatatggag tacaatcact aattttgtta gcagtatttg

28491 gaacactgtt acaagttggt tcagtcgagt ggcttcgagt gtagctgaaa aaatggggca agcactaaac 

28561 tttattatca caaaaggttc tgaatgggtt tctaacattt ggaatacagt tacaagtttc gcgagtaaag 

28631 tagctgatgg gtttaaaaga gttgtctcaa atgtaggtga cggtatgagt gatgcacttg gtaagattaa 

28701 aagtttcttc agtgatttct taaatgccgg agcggaatta atcggcaaag tagctgaggg tgtagccaaa 

28771 tctgcgcaca aagtagtcag cgcggtaggc gatgcgattt catcagcttg ggactctgta acttcattcg

28841 taagtggaca cggtggaggt agtagcttag gtaaaggttt agcggtatca caagcaaaag taattgctac

28911 agactttggc agtgccttta ataaagagct atcctctact ttgacagata gtatagtaaa tcctgtaagt 

28981 acttctatag acagacacat gactagcgat gttcaacata gcttaaaaga aaataataga cctattgtga

29051 atgtaacgat tagaaatgag ggcgaccttg atttaattaa atcacgcatt gatgacatga acgctataga

29121 cggaagtttc aacttattat aagggaggtt tgttagttga tagcgcacga tatagaagta ataaggaatg

29191 gttcacagta tcgcgtcagt gacaatcctt tcacttataa tcacttggaa gtagttgaat ataacgttac

29261 aggcgcagga tatcatcgta actattctga tatagagggt attgatggta gatttcataa ttacgctaaa

29331 gaagaactta aaaaagtaga gcttaagata aggtataaag tacctaaaat tgcttatgct tcacatttaa

29401 agtcagacgt ccaagcacta tttgctggac gtttttattt aagggaatta gctacaccag acaattcaat 

29471 taagtatgag catatattag atataccaaa agacaaacaa gcatttgagc ttgattatgt tgatggacga

29541 caactttttg taggactagt aagtgaagtt tcttttgaca caacacaaac atcaggggaa ttttctttgt 

29611 cgtttgaaac aaccgaacta ccatactttg aaagtgtcgg ttatagtact gatcttgaaa gtaataacga 

29681 ccctgaaaaa tggtcggtac ctgatagatt gcctacaaac gaaggtgata agaggcgtca aatgacattt

29751 tacaacacta actcaggaga agtttattat aacggtgatg ttcctttaac acagtttaat cagtttaatg 

29821 ttgttgaaat agagttagct gaagatgtta aagctaatga taaggatgga ttcactttct atacagataa 

29891 aggaaatatc tcagttatta aggaagttga tttaaaagcc ggagataaaa taatcttcga cggtaaacat

29961 acctatagag gttatttaaa tatagattct tttaataaaa ctttagaaca accggtttta tatccaggct

30031 ggaatcgatt caagtctaat aaagtaatga aacaaattac atttagacac aaattatatt ttagataagg 

30101 agtagcctat gccaatttta ttaaaaagtc tacagggtgt agggcacgct attaatgtta gtacaaaggt

30171 aagtaaaaag ctaaatgaag atagttcttt ggatctaact attatcgaga acgcgagtac gtttgacgca

30241 ataggtgcta taactaaaat gtggacgatc actcatgttg aaggtgaaga tgatttcaac gaatatgtaa

30311 ttgtcatact tgataagtct actattggcg aaaaaataag gcttgatatc aaagctaggc aaaaagaact

30381 tgatgacctt aacaattcta ggatttacca agagtataac gaaagtttta caggcgttga gttcttcaat 

30451 actgtcttta aaggaacggg ttataagtat gtattacatc caaaagtaga tgcatctaaa ttcgagggat 

30521 taggcaaagg agatacacga ttagaaatct ttaaaaaagg acttgagcgt tatcatctcg aatatgaata

30591 cgatgcaaag actaaaacgt ttcatttgta tgatgaatta tctaagtttg ccaattatta cattaaagct

30661 ggtgtgaatg ctgataacgt caaaatacaa gaagatgcat ctaaatgtta tacctttatt aaaggttatg

30731 gtgattttga tggacaacag acttttgcag aagcgggact acaaattgaa ttcactcatc cattagcaca

30801 attgataggt aaaagagaag cgccaccgct tgttgatgga cgtattaaaa aagaagatag tttaaaaaaa 

30871 gcaatggagt tattgataaa gaaaagtgtc actgcttcta tttccttaga ctttgtagcg ttacgtgaac

30941 atttcccaga agctaaccct aaaataggtg atgttgttag agtggtggat tctgccatag gatataacga

31011 cttagtgaga atagtcgaaa tcactacaca tagagatgcg tacaataata tcactaagca agatgtagta

31081 ttaggagact ttacaaggcg taatcgttat aacaaagcag ttcatgatgc tgcaaattat gttaaaagcg 

31151 taaaatctac aaaatccgac ccatctaaag aactaaaagc attaaacgca aaagttaacg caagtttatc

31221 tataaataat gaattggtta agcagaatga aaaaataaac gctaaagtcg ataagatgaa tactaaaaca 

31291 gttacaactg ctaatggtac gatcatgtac gactttacta gtcaatcaag tataagaaac atcaaatcaa 




31361 ttggaacgat tggcgactct gcagctagag ggtcgcacgc aaaaactaat ttcacagaaa tgttaggcaa 

31431 gaaattgaaa gctaaaacga ctaatcttgc aagaggtggc gcaacaacgg caacagttcc aataggtaaa 

31501 gaagcggtag aaaacagcat ttatagacaa gcagagcaaa taagaggaga cctaatcata ttacaaggca 

31571 ctgatgatga ctggttacac ggttattggg caggcgtacc gataggcact gataaaacgg atacaaaaac 

31641 gttttacggt gccttttgtt ctgcaactga agttattaga aagaataatc cagattcaaa aataccagtg 

31711 atgacagcta caagacaatg ccctacgagt ggtacaacaa tacgccgtaa agacacggac aaaaacaaac 

31781 tagggttaac acttgaggac tatgtaaacg ctcaaatatt agcttgtagt gagttagatg taccagtgtt 

31851 tgacgcatat cacacagatt actttaagcc atacaatcca gcttttagga aagcgagcat ggaggacggc 

31921 ttacacccta acgaaaaagg tcacgaggtt attatgtacg agttaatcaa ggattattac agtttttacg 

31991 actaaaggag gcaaccaatg gcttacggat taattacaag tttacattca atgacaggtc ggaaaatagt 

32061 tgctcaacat gagtataact atcgcttgtt agatgaaggt atgagcaaac ttgagaaaat gtttatatac 

32131 catcaaaaag aagaaatata cgcacactca gcgaaacaaa ttaaatactt gaatgacagt gttgaagatt 

32201 atttaacgta tttaaatagc cgttttagca atatgattct aggccataac ggcgacggta tcaatgaagt 

32271 aaaagacgcg cgtattgata atacaggtta tggtcataag acattgcaag atcgtttgta tcatgattat 

32341 tcaacactag atgctttcac taaaaaggtt gagaaagctg tagatgaaca ctataaagaa tatcgagcga 

32411 cagaataccg attcgaacca aaagagcaag aaccggaatt tatcactgat ttatcgccat atacaaatgc 

32481 agtaatgcaa tcattttggg tagaccctag aacgaaaatt atttatatga cgcaagctcg tccaggtaat 

32551 cattacatgt tatctagatt gaagcccaac ggacaattta ttgatagatt gcttgttaaa aacggcggtc 

32621 acggtacaca caatgcgtat agatacattg atggagaatt atggatttat tcagctgtat tggacagtaa 

32691 caaaaacaac aagtttgtac gtttccaata tagaactgga gaaataactt atggtaatga aatgcaagat 

32761 gtcatgccga atatatttaa cgacagatat acgtcagcga tttataatcc tatagaaaat ttaatgattt 

32831 tcagacgtga atataaagct cctgaaagac aagctaagaa ttcattgaat ttcattgaag taagaagtgc

32901 tgacgatatt gataaaggta tagacaaagt attgtatcaa atggatatac ctatggaata cacttcagat 

32971 acacaaccta tgcaaggtat cacttatgat gcaggtatct tatattggta tacaggtgat tcgaatacag 

33041 ccaaccctaa ctacttacaa ggtttcgata taaaaacaaa agaattgtta tttaaacgac gtatcgatat 

33111 tggcggtgtg aataataact ttaaaggaga cttccaagaa gctgagggtc tagatatgta ttacgatcta 

33181 gaaacaggac gtaaagcact tttaataggg gtaactattg gacctggtaa taacagacat cactcaattt 

33251 attctatcgg ccaaagaggt gttaaccaat tcttaaaaaa cattgcacct caagtatcga tgactgattc 

33321 aggtggacgt gttaaaccgt taccaataca gaacccagca tatctaagtg atattacgga agttggtcat 

33391 tactatatcc atacgcaaga cacacaaaat gcattagatt tcccgttacc gaaagcgttt agagatgcag 

33461 ggtggttctt ggatgtactg cctggacact ataatggtgc cctaagacaa gtacttacca gaaacagcac 

33531 aggtagaaac atgcttaaat tcgaacgtgt cattgacatt ttcaataaga aaaacaacgg agcatggaat 

33601 ttctgtccgc aaaacgccgg ttattgggaa catatcccta agagtattac aaaattatca gatttaaaaa 

33671 tcgttggttt agatttctat atcactactg aagaatcaaa acgatttact gattttccta aagactttaa 

33741 aggtattgca ggttggatat tagaagtaaa atcgaataca ccaggtaaca caacacaagt attaagacgt 

33811 aataacttcc cgtctgcaca tcaattttta gttagaaact ttggtactgg tggcgttggt aaatggagtt 

33881 tattcgaagg aaaggtggtt gaataatgat agtagataat ttttcgaaag acgataactt aatcgagtta 

33951 caaacaacat cacaatataa tccaattatt gacacaaaca tcagtttcta tgaatcagat agaggaactg 

34021 gtgttttaaa ttttgcagta actaagaata acagaccgtt atctataagt tctgaacatg ttaaaacatc 

34091 tatcgtgtta aaaaccgatg attataacgt agatagaggc gcctatattt cagacgaatt aacgatagta 

34161 gacgcaatca atgggcgttt gcagtatgtg ataccgaatg aatttttaaa acattcaggc aaggtgcatg 

34231 ctcaggcatt ctttacacaa aacgggagta ataatgttgt tgttgaacgt caatttagct tcaatattga 

34301 aaatgattta gttagtgggt ttgatggtat aacaaagctt gtttatatca aatctattca agatactatc 

34371 gaagcagtcg gtaaagactt taaccaatta aagcaagata tggatgatac acaaacgtta atagcaaaag 

34441 tgaatgatag tgcgacaaaa ggcattcaac aaatcgaaat caagcaaaac gaagctatac aagctattac 

34511 tgcgacgcaa actagtgcaa cacaagctgt tacagctgaa gtcgataaaa tagttgaaaa agagcaagcg 

34581 atttttgaac gtgttaacga agttgaacaa caaatcaatg gcgctgacct tgttaaaggt aattcaacaa 

34651 caaattggca aaagtctaaa cttacagatg attacggtaa agcaattgaa tcgtatgagc agtccataga 

34721 tagcgtttta agcgcagtta acacatctag gattattcat attactaatg caacagatgc gccagaaaag

34791 acggatatag gcacgttaga gaagcctgga caagatggtg ttgatgacgg ttcttcgttc gatgaatcaa

34861 cttatacatc aagcaaatct ggtgtgttag ttgtttatgt tgttgataat aatactgctc gtgcaacatg

34931 gtacccagac gattcaaacg atgagtacac aaaatacaaa atctacggca catggtaccc gttttataaa

35001 aagaatgatg gaaacttaac taagcaattt gttgaagaaa cgtctaacaa cgctttaaat caagctaagc 

35071 agtatgtaga tgataaattc ggaacaacga gctggcaaca acataagatg acagaggcga atggtcaatc

35141 aattcaagtt aacttaaata atgcgcaagg cgatttggga tatttaactg ctggtaatta ctatgcaaca

35211 agagtgccgg atttaccagg tagtgttgaa agttatgagg gttatttatc ggtattcgtt aaagacgata

35281 caaacaagct atttaacttc acgccttata actctaaaaa gatttacaca cgatcaatca caaacggcag

35351 acttgagcaa cagtggacag ttcctaatga acataagtca acggtattgt tcgacggtgg agcaaatggt

35421 gtaggtacaa caatcaatct aaccgaacca tacacaaact attctatttt attagtaagt ggaacttatc 

35491 caggtggcgt tattgaggga ttcggactaa ccacattacc taatgcaatt caattaagta aagcgaatgt

35561 agttgactca gacggtaacg gtggcggtat ttatgagtgt ttactatcca aaacaagtag cactacttta

35631 agaatcgata acgatgtgta ctttgattta ggtaaaacat caggttctgg agcgaatgcc aacaaagtta

35701 ctataactaa aattatgggg tggaaataat gaaaatcaca gtaaatgata aaaatgaagt tatcggatac

35771 gttaatactg gcggtttacg caatagttta gatgtagacg ataacaatgt gtctatcaaa ttcaaagaag 

35841 agttcgaacc tagaaagttc gttttcacta acggcgaaat taaatacaat agcaatttcg aaaaagaaga

35911 cgtaccgaat gcatcaaacc aacaaagtgc gtcagattta agtgatgagg aacttcgcgg aatggttgca

35981 agtatgcaaa tgcagatgac gcaagtgaac atgttgacaa tgcaattgac gcaacaaaac gctatgttaa

36051 cacaacagtt gaccgaactg aaaactaaca aaacaaatac tgagggggac gtttaaatga tgaagatgat 

36121 ttatccaact tttaaagaca ttaaaacttt ttatgtgtgg ggttgctata aaaatgagca aattaagtgg 

36191 tacgtagaca tgggtgtaat cgacaaagaa gaatatgcat tgatcactgg tgaaaaatat ccagaggcaa

36261 aagatgaaaa gtcacaggtg taatgcttga ggctttttaa tttaacacaa agtaggtggc gtaatgtttg 

36331 gatttaccaa acggcacgaa catgaatggc gaattagaag attagaagag aatgataaaa caatgcttag

36401 cactctcaat gagattaaat taggtcaaaa aactcaagag caagttaaca ttaaattaga taaaacttta 

36471 gatgctatcc agagggaaag acagatagac gaaaaaaata agaaagaaaa cgacaaaaat atacgcgata

36541 tgaaaatgtg gattctcggt ttgataggga ctatcttcag tacgattgtc atagctttac taagaactat

36611 ttttggtatt taaaggaggt gattaccatg cttaaaggga ttttaggata tagcttctgg gcgtgcttct 
36681 ggtttggtaa atgtaaataa cagttaagag tcagtgcttc ggcactggct ttttattttg attgaaatga

36751 ggtgcataca tgggattacc taacccaaag actagaaagc ctacagctag tgaagtggtg gagtgggcaa 

36821 agtcgaatat tggtaagagg attaatatag ataattatcg gggcagtcaa tgttgggata cacctaactt 

36891 tatttttaaa agatattggg gttttgtaac atggggcaat gctaaggata tggctaatta cagatatcct 

36961 aagggtttcc gattctatcg ttattcatct ggatttgtac cggaacctgg agacatcgca gtttggcacc 

37031 ctggcaacgg aataggttcg gacggacaca ccgcaatagt agtaggacca tctaataaaa gttattttta 

37101 tagcgttgac caaaactggg ttaattctaa tagttggaca ggttctccag gaagattagt aagacaccct 

37171 tatgtaagtg ttacaggctt tgttaggcct ccatactcaa aagatactag caaacctagt agtactgata 

37241 caagttcagc atcaaaagcc aatgactcaa caattactgg cgaagcgaag aaaccgcaat ttaaagaagt 

37311 taaaacagta aaatacactg cttacagcaa tgttttagat aaagaagagc acttcattga tcatatagtt 

37381 gtaatgggtg atgaacgctc agatattcaa ggattatata taaaagaatc aatgcatatg cgttctgtag 

37451 acgaactgta tacgcaaaga aataagttta taagcgatta tgaaataccg catttatatg tcgatagaga 

37521 ggctacatgg cttgctagac caaccaattt tgatgacccg cgtcacccta attggctagt tattgaagta 

37591 tgtggtggtc aaacagatag caaacgacaa ttcttattga atcaaataca agcgttaata cgtggtgttt 

37661 ggttattgtc agggattgat aaaaacttat ctgaaacgac gttaaaggta gaccctaata tttggcgtag 

37731 tatgaaagat ttaattaact acgacttgat taagcaaggt ataccggaca acgcaaagta tgagcaagtt 

37801 aaaaagaaaa tgcttgagac atacattaaa cgagatatat tgacacgaga aaatataaaa gaagtaacga 

37871 caaaaacaac aataagaatt agtgataaaa catcagttga cagtgcgtcc acacgaggcc ctactccatc 

37941 agacgaaaaa ccaagcatcg ttactgaaac aagtccattc acattccagc aagcactgga tagacaaatg 

38011 tctaggggta acccgaaaaa atctcataca tggggctggg ctaatgcaac acgagcacaa acgagctcgg 

38081 caatgaatgt taagcgaata tgggaaagta acacgcaatg ctatcaaatg cttaatttag gcaagtatca 

38151 aggcatttca gttagtgcgc ttaacaaaat acttaaagga aaaggaacgc tcgacggaca aggcaaagca 

38221 ttcgcggaag cttgtaagaa aaacaacatt aacgaaattt atttgatcgc gcacgctttc ttagaaagtg 

38291 gatacggaac aagtaacttc gctagtggta gatacggtgc atataattac ttcggtattg gtgcattcga 

38361 caacgaccct gattatgcaa tgacgtttgc taaaaataaa ggttggacat ctccagcaaa agcaatcatg 

38431 ggcggtgcta gcttcgtaag aaaggattac atcaataaag gtcaaaacac attgtaccga attagatgga 

38501 atcctaagaa tccagctacc caccaatacg ctactgctat agagtggtgc caacatcaag caagtacaat 

38571 cgctaagtta tataaacaaa tcggcttaaa aggtatctac ttcacaaggg ataaatataa ataaagaggt 

38641 gtgtaaatgt acaaaataaa agatgttgaa acgagaataa aaaatgatgg tgttgactta ggtgacattg 

38711 gctgtcgatt ttacactgaa gatgaaaata cagcatctat aagaataggt atcaacgaca aacaaggtcg 

38781 tatcgatcta aaagcacatg gcttaacacc tagattacat ttgtttatgg aagatggctc tatattcaaa 

38851 aatgagcccc ttattatcga cgatgttgta aaagggttcc ttacctacaa aatacctaaa aaggttatca 

38921 aacacgctgg ttatgttcgc tgtaagctgt ttttagagaa agaagaagaa aaaatacatg tcgcaaactt 

38991 ttctttcaat atcgttgata gtggtattga atctgctgta gcaaaagaaa tcgatgttaa attggtagat

39061 gatgctatta cgagaatttt aaaagataac gcgacagatt tattgagcaa agactttaaa gagaaaatag

39131 ataaagatgt catttcttac atcgaaaaga atgaaagtag atttaaaggt gcgaaaggtg ataaaggcga

39201 accgggacaa cctggtgcga aaggtgatac aggtaaaaaa ggagaacaag gcgcacccgg taaaaacggt

39271 actgtagtat caatcaatcc tgacactaaa atgtggcaaa ttgatggtaa agatacagat atcaaagcag 

39341 aacctgagtt attggacaaa atcaatatcg caaatgttga agggttagaa gataaattgc aagaagttaa 

39411 aaaaatcaaa gatacaactc tcaacgactc taaaacgtat acggattcaa aaattgctga actagttgat

39481 agcgcgcctg aatctatgaa tacattaaga gaattagcag aagcaataca aaacaactct atttcagaaa 

39551 gtgtattgca acagattggc tcaaaagtta gtacagaaga ttttgaggaa ttcaaacaaa cactaaacga 

39621 tttatatgct ccaaaaaatc ataatcatga tgagcggtat gttttgtcat ctcaagcttt tactaaacaa

39691 caagcggata atttatatca actaaaaagc gcatctcaac cgacggttaa aacttggaca ggaacagaaa

39761 atgaatataa ctatatatat caaaaagacc ctaatacact ttacttaatt aaggggtgat ttttatggaa 

39831 ggtaatttta aaaatgtaaa gaagtttatt tacgaaggtg aagaatatac aaaagtatat gctggaaata 

39901 tccaagtatg gaaaaagcct tcatcttttg taataaaacc cttacctaaa aataaatatc cggatagcat

39971 agaagaatca acagcaaaat ggacaataaa tggagttgaa cctaataaaa gttatcaggt gacaatagaa 

40041 aatgtacgta gcggtataat gagggtttcg caaactaatt taggttcaag tgatttagga atatcaggag 

40111 tcaatagcgg agttgcaagt aaaaatatca actttagtaa tccctcaggg atgttgtatg tcactataag

40181 tgatgtttat tcaggatccc caacattgac cattgaataa ttttaaacga ctaatttttt agtcgttttt 

40251 tattttggat aaaaggagca aacaaatgga tgcaaaagta ataacaagat acatcgtatt gatcttagca

40321 ttagtaaatc aattcttagc gaacaaaggt attagcccga ttccagtaga cgatgagact atatcatcaa

40391 taatacttac tgttgttgct ttatatacta cgtataaaga caatccaaca tctcaagaag gtaaatgggc

40461 aaatcaaaag ctaaagaaat ataaagctga aaacaagtat agaaaagcaa cagggcaagc gccaattaaa

40531 gaagtaatga cacctacgaa tatgaacgac acaaatgate tagggtaggt gttgaccaat gttgataaca 

40601 aaaaaccaag cagaaaaatg gtttgataat tcattaggga agcagttcaa tcctgatttg ttttatggat 

40671 ttcagtgtta cgattacgca aatatgtttt ttatgatagc aacaggcgaa aggttacaag gtttatacgc

40741 ttataatatt ccatttgata ataaagcaag gattgaaaaa tacgggcaaa taattaaaaa ctatgatagc

40811 tttttaccgc aaaagttgga tattgtcgtt ttcccgtcaa agtatggtgg cggagctgga catgttgaaa

40881 ttgttgagag cgcaaattta aacactttca catcatatgg gcaaaattgg aatggtaaag gttggacaaa

40951 tggcgttgcg caacctggtt ggggtcctga aactgttaca agacatgttc attattacga tgacccaatg 

41021 tattttatta gattaaattt cccagataaa gtaagtgttg gagataaagc taaaagcgtt attaagcaag 

41091 caactgccaa aaagcaagca gtaattaaac ctaaaaaaat tatgcttgta gccggtcatg gttataacga

41161 tcctggagca gtaggaaacg gaacaaacga acgcgatttt atccgtaaat atataacgcc aaatatcgct

41231 aagtatttaa gacatgcagg tcatgaagtt gcattatatg gtggctcaag tcaatcacaa gacatgtatc

41301 aagatactgc atacggtgtt aatgtaggaa ataataaaga ttatggatta tattgggtta aatcacaggg 

41371 gtatgacatt gttctagaga ttcatttaga cgcagcagga gaaaatgcaa gtggtgggca tgttattatc 

41441 tcaagtcaat tcaatgcgga tactattgat aaaagtatac aagatgttat taaaaataac ttaggacaaa

41511 taagaggtgt aacacctcgt aatgatttac tgaacgttaa tgtatcagca gaaataaata tcaattatcg 

41581 tttatctgaa ttaggtttta ttactaataa aaaagatatg gattggatta agaagaatta tgacttgtat 

41651 tctaaattaa tagctggtgc gattcatggt aagcctatag gtggtttggt agctggtaat gttaaaacat

41721 cagctaaaaa ccaaaaaaat ccaccagtgc cagcaggtta tacacttgat aagaataatg tgccttataa 

41791 aaaagagact ggtaattaca cagttgccaa tgttaaaggt aataacgtaa gggacggcta ttcaactaat

41861 tcaagaatta caggtgtatt acctaataac gcaacaatca aatatgacgg cgcatattgc atcaatgggt

41931 atagatggat tacttatatt gctaatagtg gacaacgtcg ctatattgcg acaggagagg tagataaagc

42001 aggtaatagg ataagtagtt ttggtaagtt tagcacgatt tagtatttac ttagaataaa aattttgcta

42071 cattaattat agggaatctt acagttatta aataactatt tggatggatg ttaatattcc tatacacttt

42141 ttaacattac tctcaagatt taaatgtaga taacaggcag gtactacggt acttgcctat ttttttgtta

42211 taatgtaatt acattaccag taaccaatct ggcttaaaac cacatttccg gtagccaatc cggctatgca

42281 gaggacttac ttgcgtaaag tagtaagaag ctgactgcat atttaaacca cccatactag ttgctgggtg

42351 gttgtttttt atgttatatt ataaatgatc aaaccacacc acctattaat ttaggagtgt ggttattttt

42421 tatgcaaaaa aaacgaaaaa aagttcataa aaagtattgc atatcacgtt taaccgtgtt ataataaggt

42491 ataccagttg agaggaggat aaaaagtgtt agaaaatttt aaaactatag cagaaatcgc cttttataca

42561 atgtcagcaa ttgccatagc gaaaacattg aaaaaagacg ataagtaagt agacaagccc gaaagggctg

42631 tctatatata aattctaaca ctaaaatact atgaaaacaa tttacattat tttaatcatt cttatttgga

42701 taaacgtgtt tttaggcaac gatataagta aaagtgttgt tgcactgctt actactttac tgcttatcaa

42771 tttatggaag agggataaaa atgacagcaa taaaagaaat aattgaatca atagaaaagt tattcgaaaa

42841 agaaacggga tataaaattg ctaaaaattc cggattacca tatcaaactg tgcaagattt aagaaatgga

42911 aaaacatctt tatcagatgc cagatttaga acgataataa agttatacga gtatcaaaga tcgcttgaaa

42981 acgaagaaga taaataaaag gagccaaaaa tatgtttgtt acaaaagaag aatttaaaac tttgaatgta

43051 aaagaagtat ttgaatcagg taaaaacttt ataaaaatta cagatggaag acatgcaata tattgggtaa

43121 atgatagata cgtagtactt gaccataaaa aaggcgattt gtacccgcaa aaagcatacc caaaatatat

43191 caaaagaaaa ttagtaagtt aaataattag aaaaccacgt cttaattgac gtggttattt tttaggtttg

43261 cgcgtgtcaa atacgtgtca atttagttct atttctttag ttttctttct aaacttaatt gcttgtaaac

43331 cgcatagtta taggcttttc agccatatac caagataaga tttatcccgc cgtctccata aaaatatgct

43401 tggaaacctt gatttaatgg ggttttaatc tagcaagtgt caaatatgtg tcaagaaaat aattttctga

43471 cacgttgacc ttgctctttt ttatgttcat caagtaagtg agagtaggtg tctaaagtta tagatatatt

43541 ataatggcct aatcttttgc taatatattc aatagg
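
The number at the start of each sequence line gives the coordinate of the first base on that line, so any base or codon can be looked up directly from the listing. The following is a minimal Python sketch of that lookup, assuming 1-based coordinates and space-separated blocks of bases; the function names `parse_listing` and `subsequence` are illustrative only and are not part of the original disclosure.

```python
def parse_listing(lines):
    """Map each 1-based coordinate to its base.

    Each input line is assumed to look like
    '36751 ggtgcataca tgggattacc ...' (start coordinate, then blocks of bases).
    """
    bases = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that does not begin with a coordinate.
        if not fields or not fields[0].isdigit():
            continue
        pos = int(fields[0])
        for base in "".join(fields[1:]):
            bases[pos] = base
            pos += 1
    return bases


def subsequence(bases, start, end):
    """Return the bases from start to end, both inclusive."""
    return "".join(bases[i] for i in range(start, end + 1))


# One line of the listing above, copied verbatim.
listing = [
    "36751 ggtgcataca tgggattacc taacccaaag actagaaagc ctacagctag tgaagtggtg gagtgggcaa"
]
bases = parse_listing(listing)
print(subsequence(bases, 36760, 36762))  # atg
```

Under these assumptions the three bases beginning at coordinate 36760 read `atg`.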
Table 10



Bacteriophage 96 ORFs list

SID     LAN       FRA  POS            a.a.  RBS sequence               STA  STO
100733  96ORF001   1   25999..29142   1047  ccttgaatcgaaaggaggttagcct  ttg  taa
100734  96ORF002   1   32008..33906    632  tttttacgactaaaggaggcaacca  atg  taa
100735  96ORF003   1   30109..31995    628  ttatattttagataaggagtagcct  atg  taa
100736  96ORF004   1   36760..38634    624  attttgattgaaatgaggtgcatac  atg  taa
100737  96ORF005   3   33903..35729    608  gtttattcgaaggaaaggtggttga  ata  taa
100738  96ORF006   2   [illegible]     484  aatgatttagggtaggtgttgacca  atg  tag
100739  96ORF007   1   18652..20091    479  tatacacacatactaaacctgaacg  att  tga
100740  96ORF008   2   8960..10201     413  tggcagaatttgggggcgataacga  atg  tga
100741  96ORF009   2   17447..18670    407  gacgcaataacggaagtgatcgtca  atg  tga
100742  96ORF010   1   38647..39819    390  taaatataaataaagaggtgtgtaa  atg  tga
100743  96ORF011  -1   119..1195       358  gtagctcgcctacccttattatttt  ttg  tga
100744  96ORF012   2   20045..21013    322  tttaatgacaaattacctgacatag  atg  tga
100745  96ORF013   3   29157..30098    313  acttattataagggaggtttgttag  ttg  taa
100746  96ORF014   1   21925..22839    304  agaaaataaagtgaggtaataaaat  atg  tag
100747  96ORF015   1   5812..6591      259  atacacggtaaaggtgggagaatag  atg  taa
100748  96ORF016   1   7852..8607      251  aataaaatgttgaaaggagagaaaa  atg  taa
100749  96ORF017   3   3444..4190      248  aaatttaacattaatatcactttaa  gtg  taa
100750  96ORF018  -3   28281..29000    239  taagctatgttgaacatcgctagtc  atg  tga
100751  96ORF019   3   [illegible]     223  tttaccgttctaggacgtggtttaa  atg  taa
100752  96ORF020   3   21324..21908    194  gaagggcaaaaaggagttttgatat  atg  taa
100753  96ORF021   3   6612..7175      187  attaaaaattaattaaaaggacggt  ata  tag
100754  96ORF022   2   [illegible]     185  aaagaaaaacgaaggagtgtattaa  atg  taa
100755  96ORF023   1   5275..5811      178  catgaaatggtaggaggtatgaaaa  gtg  tag
100756  96ORF024   3   14481..15014    177  taaaacgataggagataacgaataa  atg  taa
100757  96ORF025   2   25157..25666    169  ataaaaaaattgaaaagaggtatat  att  taa
100758  96ORF026  -3   15084..15590    168  tcattcttaacatagcccttaattc  atg  tga
100759  96ORF027  -1   1229..1732      167  aatagcaaataaaggagtgtaaaac  atg  taa
100760  96ORF028   1   16960..17454    164  aaggcgtgtgatacagtgaaaacaa  ttg  taa
100761  96ORF029  -1   1736..2227      163  tatgagaaaaggagtcatataaaag  atg  taa
100762  96ORF030   1   25531..25995    154  ttttcaagagggagagtcgctcgta  ctg  tag
100763  96ORF031   2   23633..24097    154  tttagtattgaaggtgattctgtag  atc  tag
100764  96ORF032  -2   2248..2706      152  ataagacaccaaaggggtttggcgc  atg  tga
100765  96ORF033  -3   39147..39605    152  agcatataaatcgtttagtgtttgt  ttg  taa
100766  96ORF034   2   13181..13615    144  tagaagtcgaaaaagtggaggcaat  ata  taa
100767  96ORF035   2   10628..11053    141  gagctaggattgcaagcaacgatat  ttg  tga
100768  96ORF036   2   24110..24535    141  gtatttttcatagaggtggttaaat  atg  taa
100769  96ORF037   1   12583..12996    137  atgaggaacagaagcaaccaacttt  att  tga
100770  96ORF038   1   15628..16032    134  atgttaagaatgatgcctagtttaa  ttg  taa
100771  96ORF039   3   39816..40220    134  ctaatacactttacttaattaaggg  gtg  taa
100772  96ORF040  -3   27528..27932    134  tttccataaataaacgaggacacca  atg  tga
100773  96ORF041   3   [illegible]     133  [illegible]                atg  tga
100774  96ORF042   2   35720..36106    128  aagttactataactaaaattatggg  gtg  taa
100775  96ORF043  -2   35713..36081    122  ttaaacgtccccctcagtatttgtt  ttg  taa
100776  96ORF044  -2   9460..9828      122  agtatccatcagttgaagataatct  ata  taa
100777  96ORF045  -3   5139..5504      121  ttctttttgtattctgtaatattca  att  tga
100778  96ORF046   2   11513..11872    119  aagtaaatgtatagaggtggaataa  atg  taa
100779  96ORF047   2   22991..23350    119  gtcgtactacgtctgataagagcga  gtg  tag
100780  96ORF048   3   8607..8963      118  tggaaaaagaattgagtgatgacta  atg  tga
100781  96ORF049   1   23353..23697    114  atccgtttaaaccaataaggtagag  gtg  taa
100782  96ORF050  -2   2728..3072      114  tggtaaattagtattacattaagta  ata  taa
100783  96ORF051   3   4692..5021      109  tcaaaatatacggaggtagtcaact  atg  tga
100784  96ORF052  -1   20882..21211    109  gtagcaaagagacaactaaaaaagt  gtg  taa
100785  96ORF053   1   40252..40578    108  acgactaattttttagtcgtttttt  att  tag
100786  96ORF054   1   4942..5262      106  aatataaaactaaaaaacaaaattt  atg  tag
100787  96ORF055  -2   4840..5151      103  ccgtcgcaatatatagttcgcttaa  atc  taa
100788  96ORF056   3   36324..36623     99  aatttaacacaaagtaggtggcgta  atg  taa
100789  96ORF057   2   1394..1690       98  cttcagtggctcttttagcatttaa  ata  taa
100790  96ORF058  -3   26247..26537     96  tacttcttttctcataatctgacca  att  tga
100791  96ORF059  -1   21485..21772     95  agactcaacgcctttttgaacatac  ttg  tga
100792  96ORF060  -3   22647..22931     94  cctctttgtaaccgacaagactgta  ata  taa
100793  96ORF061   1   14023..14304     93  ttatctaattaagggggacgagtga  gtg  taa
100794  96ORF062  -2   38281..38559     92  tatataacttagcgattgtacttgc  ttg  taa
100795  96ORF063  -3   30786..31064     92  gtctcctaatactacatcttgctta  gtg  tga
100796  96ORF064  -2   30205..30480     91  atgcatctacttttggatgtaatac  ata  tag
100797  96ORF065   1   2617..2886       89  aaggtctaataaaaatttctccttc  ctg  taa
100798  96ORF066   3   28056..28325     89  aaggtgtagtcggctggttaactga  att  taa
100799  96ORF067  -3   17142..17411     89  ttccgttattgcgtcgtgaagttgt  ttg  tga
100800  96ORF068   2   12326..12589     87  aatgcatgtcgtttggtctgcctaa  ttg  tag
100801  96ORF069   2   42734..42997     87  tttttaggcaacgatataagtaaaa  gtg  taa
100802  96ORF070   1   11869..12129     86  aaatgttcaagaaatggagtgaagc  ata  taa
100803  96ORF071   3   15396..15656     86  aacaagctatacaaattatcgacaa  att  taa
100804  96ORF072  -3   37749..38009     86  agattttttcgggttacccctagac  att  tga
101135  96ORF403  -3   38607..38708     33  atcttcagtgtaaaatcgacagcca  atg  tag
101136  96ORF404  -3   21288..21389     33  cagacaccgtcttaagtccctttag  ata  taa
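
In the legible rows of Table 10, the POS span appears to cover three bases for each amino acid in the a.a. column plus one stop codon, and for forward-strand ORFs the FRA value appears to follow from the start coordinate. Both relations are inferred from the table entries themselves and are not stated in the text. The Python sketch below checks them on a few rows; the function names are illustrative assumptions, and only positive frame values are handled.

```python
def orf_length_consistent(start, end, aa):
    """True when start..end (1-based, inclusive) holds aa codons plus one stop codon."""
    return end - start + 1 == 3 * (aa + 1)


def forward_frame(start):
    """Forward reading frame (1, 2 or 3) implied by a 1-based start coordinate."""
    return (start - 1) % 3 + 1


# (LAN, FRA, start, end, a.a.) copied from legible rows of Table 10.
rows = [
    ("96ORF001", 1, 25999, 29142, 1047),
    ("96ORF004", 1, 36760, 38634, 624),
    ("96ORF013", 3, 29157, 30098, 313),
    ("96ORF016", 1, 7852, 8607, 251),
]

for name, frame, start, end, aa in rows:
    length_ok = orf_length_consistent(start, end, aa)
    frame_ok = forward_frame(start) == frame
    print(name, length_ok, frame_ok)
```

A row that fails either check is a candidate transcription error in POS, a.a. or FRA, which can help when a scanned entry is hard to read.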
Table 11



SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1 

M32695

Bacteriophage PM2 nuclease cleavage site
gi|166145|gb|M32695|BM2NCS [166145]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32694

Bacteriophage PM2 Hind III fragment 3
gi|166143|gb|M32694|BM23HIND3 [166143]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M26134

Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich
regions and anti-Z-DNA-IgG binding regions, complete cds
gi|289360|gb|M26134|BM2PROTIV [289360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
J02452

bacteriophage fi 3'-terminal region ma
gi|215409|gb|J02452|PFITR3 [215409]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
AF020798

Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCP1 [217761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)
X72793

Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70
genes and ORF-22
gi|516171|emb|X72793|CBCBONT [516171]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464

Clostridium botulinum D Phage C3 gene for exoenzyme C3
gi|14907|emb|X51464|CBDPE3 [14907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210

Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin
gi|217780|dbj|D90210|CSTC1TOX [217780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
S49407

type D neurotoxin [bacteriophage d-16 phi, host = C. Botulinum, type D, CB16, Genomic, 4087 nt]
gi|260238|gb|S49407|S49407 [260238]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370

Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene
gi|15733|emb|X53370|POTS298 [15733]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371

Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene
gi|15731|emb|X53371|POTS224 [15731]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973

Bacteriophage phi29 prohead RNA
gi|15680|emb|X05973|POP29PRO [15680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155

Left end of bacteriophage phi-29 coding for 15 potential proteins. Among
these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4
gi|15659|emb|V01155|POP29B [15659]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097

Bacteriophage phi-29 left origin of replication
gi|312194|emb|X73097|BP29ORIL [312194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430

Bacteriophage phi-29 gene-17 gene, complete cds
gi|215321|gb|M14430|P29G17A [215321]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 8 nucleotide neighbors)
M14431

Bacteriophage phi-29 gene-16 gene, complete cds
gi|215319|gb|M14431|P29G16A [215319]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)
M20693

Bacteriophage phi-29 DNA, 3' end
gi|215343|gb|M20693|P29REPINB [215343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016

Bacteriophage phi-29 DNA, 5' end
gi|215342|gb|M21016|P29REPINA [215342]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M12456

Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10
connector, complete, and p11 lower collar, incomplete, respectively
gi|215338|gb|M12456|P29P9 [215338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein, pre-neck
appendage protein, morphogenesis(13), lysis, morphogenesis(15), encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)
M26968

Bacteriophage phi-29 (from Bacillus subtilis) proteins p1 delta-1 genes, complete cds, and the sus1(629) mutation
gi|341558|gb|M26968|P29P1D1A [341558]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
J02448

Bacteriophage f1, complete genome
gi|166201|gb|J02448|F1CCG [166201]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors, or 1 genome link)
M24832

Bacteriophage f2 coat protein gene, partial cds
gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
J02451

Bacteriophage fd, strain 478, complete genome
gi|215394|gb|J02451|PFDCG [215394]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, or 1 genome link)
M34834

Bacteriophage fr replicase gene, 5' end
gi|166139|gb|M34834|BFRREGRA [166139]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M38325

Bacteriophage fr replicase gene, 5' end
gi|166137|gb|M38325|BFRREGR [166137]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063

Bacteriophage fr coat protein replicase cistron (R region) RNA
gi|166134|gb|M35063|BFRRCRRA [166134]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
S66567

alpha-atrial natriuretic factor/coat protein=fusion polypeptide [human,
bacteriophage fr, expression vector pFAN 15, PlasmidSyntheticRecombinant, 510nt]
gi|435742|gb|S66567|S66567 [435742]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 15 nucleotide neighbors)
X15031

Bacteriophage fr RNA genome
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)
U51233

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable
region light chain (IgM) mRNA, partial cds
gi|1277150|gb|U51233|MMU51233 [1277150]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds
gi|1277148|gb|U51232|MMU51232 [1277148]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)
U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
V00604

Phage M13 genome
gi|14959|emb|V00604|INM13X [14959]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide neighbors)
A32252

Synthetic bacteriophage M13 protein III probe
gi|1567340|emb|A32252|A32252 [1567340]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251

Synthetic bacteriophage M13 protein III probe
gi|1567339|emb|A32251|A32251 [1567339]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465

Bacteriophage M13 mp10 mutations in lac operon
gi|215210|gb|M12465|M13LACMUT [215210]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177

Synthetic Bacteriophage M13 (clone M13.SV.B12) SV40 early promoter region DNA
gi|209416|gb|M24177|SYNSVB12 [209416]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176

Synthetic Bacteriophage M13 (clone M13.SV.B11) SV40 early promoter region DNA
gi|209415|gb|M24176|SYNSVB11 [209415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24175

Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA
gi|208806|gb|M24175|SYNM13SV8 [208806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 242 nucleotide neighbors)
M19979

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207813|gb|M19979|SYN33M13M [207813]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207808|gb|M19565|SYN33M13H [207808]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207807|gb|M19564|SYN33M13G [207807]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)
M19563

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207806|gb|M19563|SYN33M13F [207806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207804|gb|M19561|SYN33M13D [207804]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207803|gb|M19560|SYN33M13C [207803]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207802|gb|M19559|SYN33M13B [207802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568

Bacteriophage M13 replicative form II, replication origin, specific nick location
gi|215220|gb|M10568|M13ORIB [215220]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910

Bacteriophage M13 gene II regulatory region and M13sjl mutant
gi|215209|gb|M10910|M13IIREG [215209]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)
M38295

Bacteriophage M13 HaeIII restriction fragment DNA
gi|215208|gb|M38295|M13HAEIII [215208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)
E02067

DNA encoding a part of Bacteriophage M13 te 127
gi|2170311|dbj|E02067|E02067 [2170311]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467

Bacteriophage MS2, complete genome
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)
AJ004950

Bacteriophage P1 ban gene
gi|3688226|emb|AJ011592|BP1011592 [3688226]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
U88974

Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44b),
pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds
gi|2661099|gb|AF035607|AF035607 [2661099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)
AJ000741

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828

Bacteriophage P1 recombinase gene cin
gi|15133|emb|X01828|MYP1CIN [15133]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
S61175

immI operon: icd=cell division repressor, ant1=antirepressor {promoters
P51a, P51b} [bacteriophage P1, Genomic, 728 nt]
gi|385908|gb|S61175|S61175 [385908]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X87824

Bacteriophage P1 gene 26
gi|861164|emb|X87824|XXBP1G26 [861164]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X15638

Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames
gi|15735|emb|X15638|PP1LREP [15735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)
X17512

Bacteriophage P1 DNA for immunity region immI
gi|15479|emb|X17512|P1IMMUNIY [15479]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005

Bacteriophage P1 c1 gene for P1c1 repressor protein
gi|15477|emb|X16005|P1C1 [15477]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453

Bacteriophage P1 cre gene for recombinase protein
gi|15135|emb|X03453|MYP1CRE [15135]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561

Bacteriophage P1 c1 gene 5'-region
gi|15128|emb|X06561|MYP1C1 [15128]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)
V01534

Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains
four unidentified reading frames and is known as insertion hot spot for IS2 insertion sequences
gi|15118|emb|V01534|MYOVP1 [15118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X56951

Bacteriophage P1 gene 10
gi|406728|emb|X56951|BPP1GP10 [406728]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380

Bacteriophage P1 replication region including repA, parA, and parB genes and
incA, incB, and incC incompatibility determinants
gi|215652|gb|K02380|PP1REP [215652]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87673

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618

Bacteriophage P1 c1 repressor binding sites
gi|215600|gb|M16618|PP1C1 [215600]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
SEG_PP1CIN

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
K03173

Bacteriophage P1 C invertible element, right end, and cixR recombination site
gi|215606|gb|K03173|PP1CIN2 [215606]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605 

£SSSS3^!« Kq* 1 reCOmbinaSC ' ^ si "- » d 5 ' » d -f C i-nible clement 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25470

Bacteriophage P1 tail fiber protein gene, complete cds
gi|341349|gb|M25470|PP1TFPR [341349]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
M34382

Bacteriophage P1 sim region proteins, complete cds
gi|215661|gb|M34382|PP1SIM [215661]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956

Bacteriophage P1 R protein (R) gene, complete cds
gi|215658|gb|M81956|PP1RP [215658]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080

Bacteriophage P1 mini-P1 plasmid origin of replication
gi|215657|gb|M37080|PP1REPOR [215657]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)
M27041

Bacteriophage P1 ref gene, complete cds
gi|215650|gb|M27041|PP1REF [215650]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L01408

Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)

SEG_PP1PAR
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215639|gb||SEG_PP1PAR [215639]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425

Bacteriophage miniplasmid P1 parB gene, 3' end
gi|215638|gb|M36425|PP1PAR2 [215638]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215637|gb|M36424|PP1PAR1 [215637]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129

Bacteriophage P1 miniplasmid origin of replication region
gi|215632|gb|M11129|PP1ORIM [215632]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)
M25414

Bacteriophage P1 c1 repressor binding site, operator 88 (Op88)
gi|215631|gb|M25414|PP1OP88A [215631]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413

Bacteriophage P1 c1 repressor binding site, operator 68 (Op68)
gi|215630|gb|M25413|PP1OP68A [215630]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412

Bacteriophage P1 c1 repressor binding site, operator 21 (Op21)
gi|215629|gb|M25412|PP1OP21A [215629]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510

Bacteriophage P1 recombination site loxR
gi|215628|gb|M10510|PP1LOXR [215628]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M10494

Bacteriophage P1 recombination site loxP
gi|215626|gb|M10494|PP1LOXP [215626]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 134 nucleotide neighbors)
M10511

Bacteriophage P1 recombination site loxL
gi|215625|gb|M10511|PP1LOXL [215625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10512

Bacteriophage P1 recombination site loxB
gi|215624|gb|M10512|PP1LOXB [215624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M10145

Bacteriophage P1 genome fragment with recombination site loxP
gi|215623|gb|M10145|PP1CREX [215623]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 21 nucleotide neighbors)
Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326
gi|215622|gb|M13327|PP1CN26IV [215622]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13325

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326
gi|215621|gb|M13325|PP1CN26II [215621]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13323

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325
gi|215620|gb|M13323|PP1CN25IV [215620]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13321

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325
gi|215619|gb|M13321|PP1CN25II [215619]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M13324

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326
gi|215618|gb|M13324|PP1CIN26I [215618]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13319

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI327
gi|215617|gb|M13319|PP1CIN27R [215617]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13320

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325
gi|215616|gb|M13320|PP1CIN25I [215616]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13318

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324
gi|215615|gb|M13318|PP1CIN24L [215615]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1370 nucleotide neighbors)
M13317

Bacteriophage P1 Cin recombinase activated cross over site, middle junction, clone pSHI323
gi|215614|gb|M13317|PP1CIN23M [215614]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1055 nucleotide neighbors)
M13316

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323
gi|215613|gb|M13316|PP1CIN23L [215613]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13315

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322
gi|215612|gb|M13315|PP1CIN22R [215612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13314

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322
gi|215611|gb|M13314|PP1CIN22L [215611]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13313

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321
gi|215610|gb|M13313|PP1CIN21R [215610]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13312

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321
gi|215609|gb|M13312|PP1CIN21L [215609]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M16568

Bacteriophage P1 c4 repressor gene, complete cds
gi|215603|gb|M16568|PP1C4 [215603]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M13326

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326
gi|215602|gb|M13326|PP1C26III [215602]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1192 nucleotide neighbors)
M13322

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325
gi|215601|gb|M13322|PP1C25III [215601]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1231 nucleotide neighbors)
J05651

Bacteriophage P1 modulator protein (bof) gene, complete cds
gi|215598|gb|J05651|PP1BOFY1 [215598]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M33224

Bacteriophage P1 regulatory protein (bof) gene, complete cds
gi|215596|gb|M33224|PP1BOFFO [215596]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)

M10288

E.coli/bacteriophage P1 loxR recombination site
gi|146647|gb|M10288|ECOLOXR [146647]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10289

E.coli/bacteriophage P1 loxL recombination site
gi|146646|gb|M10289|ECOLOXL [146646]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10290

E.coli loxB site, which can recombine with bacteriophage P1 loxP site
gi|146645|gb|M10290|ECOLOXB [146645]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M74046

Bacteriophage P1 pacA and pacB genes, complete cds
gi|215634|gb|M74046|PP1PACAB [215634]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M95666

Bacteriophage P1 gene 10, doc and phd genes, complete cds
gi|463276|gb|M95666|PP1PHDDOC [463276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor)
M25604

Bacteriophage Q-beta mutated autonomously replicating sequence MDV1 RNA fragment
gi|556359|gb|M25604|PQBARSMUT [556359]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
V00643

first half of the phage Q-beta gene for coat protein
gi|15088|emb|V00643|LEQBET [15088]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25167

Bacteriophage Q-beta RNA fragment recovered from replicase binding complex
gi|556362|gb|M25167|PQBREPLICB [556362]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24876

Bacteriophage Q-beta replicase RNA, 5' end
gi|556360|gb|M24876|PQBREPLICA [556360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25444

Synthetic bacteriophage Q-beta DNA
gi|209118|gb|M25444|SYNPQBTERM [209118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M25463

Bacteriophage Q-beta self-replicating microvariant (+) RNA
gi|532489|gb|M25463|PQBMVSRRNA [532489]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25014

Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end
gi|294316|gb|M25014|PQBREPLC [294316]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M25065

Bacteriophage Q-beta RNA sequence with putative stem loop
gi|294315|gb|M25065|PQBLOOP [294315]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10265

Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly
gi|215726|gb|M10265|PQBRNA [215726]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M24815

Bacteriophage Q-beta specified replicase subunit RNA
gi|215725|gb|M24815|PQBREPL [215725]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M25461

Bacteriophage Q-beta plus-strand RNA, 5' terminus
gi|215724|gb|M25461|PQBPS5E [215724]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25462

Bacteriophage Q-beta plus-strand RNA, 3' terminus
gi|215723|gb|M25462|PQBPS3E [215723]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24871

Bacteriophage Q-beta nanovariant WSIII RNA
gi|215722|gb|M24871|PQBNVWSIC [215722]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24870

Bacteriophage Q-beta nanovariant WSII RNA
gi|215721|gb|M24870|PQBNVWSIB [215721]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24869

Bacteriophage Q-beta nanovariant WSI RNA
gi|215720|gb|M24869|PQBNVWSIA [215720]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10495

Coliphage Q-beta MDV-1(+) RNA
gi|215719|gb|M10495|PQBMDV1A [215719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
J02484

bacteriophage qbeta coat protein cistron first half
gi|215717|gb|J02484|PQBCP5 [215717]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M57754

Bacteriophage Q-beta minus strand RNA, 5' terminus
gi|215716|gb|M57754|PQBBMS5E [215716]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24297

Bacteriophage Q-beta 5'-terminal region of the minus strand
gi|215715|gb|M24297|PQB5END [215715]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M10695

Bacteriophage Q-beta, MDV-1 RNA
gi|215714|gb|M10695|PQB1IR [215714]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 12 nucleotide neighbors)
M24827

Bacteriophage R17 A protein gene, 5' end
gi|216078|gb|M24827|R17RNACIS [216078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M24829

Bacteriophage R17 coat protein gene, 5' end
gi|216075|gb|M24829|R17CP5 [216075]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02488

bacteriophage r17 rna synthetase initiation site
gi|216080|gb|J02488|R17RNASYN [216080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors)
J02487

bacteriophage r17 coat protein initiation site
gi|216073|gb|J02487|R17COATP [216073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02486

bacteriophage r17 a protein initiation site
gi|216071|gb|J02486|R17APROT [216071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M24826

Bacteriophage R17 coat protein RNA fragment
gi|216077|gb|M24826|R17CPRAA [216077]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M24296

Bacteriophage R17 3'-terminal fragment A RNA
gi|216070|gb|M24296|R173TFA [216070]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
1TFN

structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average
structure ribonucleic acid, hairpin, bacteriophage r17 mol id: 1; molecule: r17c; chain: null; engineered: yes
gi|1942336|pdb|1TFN| [1942336]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
1RPEA

rna (5'-d(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu-3') (24-mer rna
hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure)
gi|1421020|pdb|1RHT| [1421020]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
M14428

Bacteriophage S13 circular DNA, complete genome
gi|216089|gb|M14428|S13CG [216089]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 12 protein links, 26 nucleotide neighbors)
J05393

Bacteriophage T1 DNA N-6-adenine-methyltransferase (M.T1) gene, complete cds
gi|166163|gb|J05393|BT1NAMTA [166163]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
L46845

Bacteriophage T2 frd3, frd2 genes, complete cds
gi|951387|gb|L46845|PT2FRD32G [951387]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 17 nucleotide neighbors)
L43611

Bacteriophage T2 fibritin (wac) gene, complete cds
gi|903869|gb|L43611|PT2WAC [903869]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
M24812

Bacteriophage T2 secondary structure RNA sequence
gi|215796|gb|M24812|PT2RNA [215796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M22342

Bacteriophage T2 DNA (adenine-N6)methyltransferase (dam) gene, complete cds
gi|215792|gb|M22342|PT2DAM [215792]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
S57515

orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt]
gi|298524|gb|S57515|S57515 [298524]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X05312

Bacteriophage T2 gene 38 for receptor recognizing protein
gi|15197|emb|X05312|MYT2G38 [15197]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04442

Bacteriophage T2 gene 37 for receptor recognizing protein
gi|15195|emb|X04442|MYT2G37 [15195]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X12460

Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein
gi|15192|emb|X12460|MYT2G32 [15192]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors)
X57797

Bacteriophage T2 gene for gp12
gi|14875|emb|X56555|BT2GP12 [14875]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 2 nucleotide neighbors)
X01755

Bacteriophage T2 tail fiber gene 36
gi|15189|emb|X01755|MYT2F36 [15189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M14784

gSXS 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors)

SEG_PT3RNAPOL
Bacteriophage T3 RNA polymerase III gene, 5' end
gi|710559|gb||SEG_PT3RNAPOL [710559]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M22610

Bacteriophage T3 RNA polymerase III gene, 3' end
gi|340722|gb|M22610|PT3RNAPOL2 [340722]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M22609

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|340721|gb|M22609|PT3RNAPOL1 [340721]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X05031

Bacteriophage T3 gene region 1-2.5 with primary origin of replication 4
gi|15719|emb|X05031|POT3ORI [15719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors)
X03964

Bacteriophage T3 early control region pos. 308-810 from genome left end
gi|15718|emb|X03964|POT3EP [15718]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 20 nucleotide neighbors)
X17255

Bacteriophage T3 gene 1 to gene 11
gi|15682|emb|X17255|FOT31 UG [15682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 36 protein links, 17 nucleotide neighbors, or 1 genome link)

X15840
Phage T3 gene 10
gi|15625|emb|X15840|PODT3G10 [15625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X02981

Bacteriophage T3 gene 1 for RNA polymerase
gi|15561|emb|X02981|PODOT3P [15561]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02503

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215816|gb|J02503|PT3TRS1 [215816]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_PT3TRS

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215818|gb||SEG_PT3TRS [215818]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02504

bacteriophage t3 3' end, terminally redundant sequence (trs)
gi|215817|gb|J02504|PT3TRS2 [215817]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

http://www.rs.noda.sut.ac.jp/~kunisawa
Bacteriophage T4 genomic database compiled by Arisaka et al.

X95646

Bacteriophage T5 DNA for region 60.5%-71% of the T5 genome
gi|2791557|emb|AJ001191|BTJ0OU91 [2791557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors)
X56847

Bacteriophage T5 genomic region encoding early genes D10-D15
gi|15407|emb|X12930|MYT5D10 [15407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors)
AF039886

Bacteriophage T5 subclone T5.5.3r5.18r, single pass sequence, genomic survey sequence
gi|2811154|gb|AF039886|AF039886 [2811154]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039885

Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence
gi|2811153|gb|AF039885|AF039885 [2811153]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039884

Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence
gi|2811152|gb|AF039884|AF039884 [2811152]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039883

Bacteriophage T5 subclone 10-T5.5.7F, single pass sequence, genomic survey sequence
gi|2811151|gb|AF039883|AF039883 [2811151]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039882

Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence
gi|2811150|gb|AF039882|AF039882 [2811150]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039881

Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence
gi|2811149|gb|AF039881|AF039881 [2811149]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF039880

Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence
gi|2811148|gb|AF039880|AF039880 [2811148]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039879

Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence
gi|2811147|gb|AF039879|AF039879 [2811147]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039878

Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence
gi|2811146|gb|AF039878|AF039878 [2811146]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)

AF039877

Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence
gi|2811145|gb|AF039877|AF039877 [2811145]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039876

Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence
gi|2811144|gb|AF039876|AF039876 [2811144]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039875

Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence
gi|2811143|gb|AF039875|AF039875 [2811143]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039874

Bacteriophage T5 subclone 21-T5.16F, single pass sequence, genomic survey sequence
gi|2811142|gb|AF039874|AF039874 [2811142]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039873

Bacteriophage T5 subclone 09-T5.6T, single pass sequence, genomic survey sequence
gi|2811141|gb|AF039873|AF039873 [2811141]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039872

Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence
gi|2811140|gb|AF039872|AF039872 [2811140]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039871

Bacteriophage T5 subclone 04-T5.26.R, single pass sequence, genomic survey sequence
gi|2811139|gb|AF039871|AF039871 [2811139]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039870

Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence
gi|2811138|gb|AF039870|AF039870 [2811138]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X69460

Bacteriophage T5 ltf gene for L-shaped tail fibers
gi|15415|emb|X69460|MYT5LTF [15415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors)
X03402

Bacteriophage T5 D15 gene for 5' exonuclease
gi|15413|emb|X03402|MYT5EXOG [15413]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
Z11972

Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and
tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa
gi|15795|emb|Z11972|T56TRNAG [15795]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X03898

Bacteriophage T5 genes for tRNA-His, -Ser and -Leu
gi|15786|emb|X03898|STT5RN1 [15786]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 MEDLINE links)
X04177

Bacteriophage T5 gene for transfer RNA-Gln
gi|15421|emb|X04177|MYT5TRNQ [15421]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
X03899

Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3
gi|15787|emb|X03899|STT5RN2 [15787]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X03798

Bacteriophage T5 gene for tRNA-Asp (GUC)
gi|15472|emb|X03798|NCT5TKDG [15472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Y00364

Bacteriophage T5 tRNA gene cluster (27.8%-22.4%)
gi|15420|emb|Y00364|MYT5TRN [15420]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X03140

Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind III-P fragment)
gi|15417|emb|X03140|MYT5RHO [15417]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

Z35070
Bacteriophage T6 DNA
gi|535228|emb|Z35074|MYEREGBT6 [535228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF060870

Coliphage T6 small subunit distal tail fiber (gene 36) gene, partial cds; and large subunit distal tail fiber (gene 37) and tail fiber
adhesin (gene 38) genes, complete cds
gi|3676458|gb|AF052605|AF052605 [3676458]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 2 nucleotide neighbors)
Z35072

Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene
gi|535232|emb|Z35072|MYTAILT6 [535232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X12488

Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein
gi|15843|emb|X12488|MYT6G32 [15843]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)

Z78095

Bacteriophage T6 DNA (1506 bp)
gi|1488562|emb|Z78095|BPHZ78095 [1488562]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z35079

Bacteriophage T6 DNA for Ip5, Ip6
gi|535215|emb|Z35079|MY57BT6 [535215]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X68725

E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase
gi|296439|emb|X68725|ECT6 [296439]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
X69894

Bacteriophage T6 alt gene for ADP-Ribosyltransferase
gi|15422|emb|X69894|MYT6ADP [15422]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L46846

Bacteriophage T6 frd3, frd2 genes, complete cds
gi|951390|gb|L46846|PT6FRD32G [951390]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
M27738

Bacteriophage T6 translational repressor protein (regA), complete cds
gi|215993|gb|M27738|PT6REGA [215993]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors)
M38465

Bacteriophage T6 DNA ligase gene, complete cds
gi|215991|gb|M38465|PT6LIG55 [215991]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
V01146
Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)

X60322

Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H
gi|14775|emb|X60322|BACALPHA [14775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 22 nucleotide neighbors, or 1 genome link)

X13332

Bacteriophage alpha3 DNA for origin of replication
gi|15093|emb|X13332|MIA3ORPL [15093]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X12611

Bacteriophage alpha3 gene for protein A part, finger domain
gi|15092|emb|X12611|MIA3AFIN [15092]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors)
X15721

Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication
gi|14774|emb|X15721|BA3DMOR9 [14774]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15720

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14773|emb|X15720|BA3DMOR8 [14773]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
X15719

Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication
gi|14772|emb|X15719|BA3DMOR7 [14772]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15718

Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication
gi|14771|emb|X15718|BA3DMOR6 [14771]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15717

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14770|emb|X15717|BA3DMOR5 [14770]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
X15716

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14769|emb|X15716|BA3DMOR4 [14769]

(View GenBank repon,FASTA report,ASN.l report,Graphical view,l MEDLINE link, or 10 nucleotide neighbors ) 
X15715

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14768|emb|X15715|BA3DMOR3 [14768]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 11 nucleotide neighbors )
X15714

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14767|emb|X15714|BA3DMOR2 [14767]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 11 nucleotide neighbors )
X15713

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14766|emb|X15713|BA3DMOR1 [14766]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 11 nucleotide neighbors )
X62059

Bacteriophage alpha3 origin of cDNA synthesis (oriGA)
gi|14763|emb|X62059|AL3ORIGA [14763]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 13 nucleotide neighbors )
X62058

Bacteriophage alpha3 origin of cDNA synthesis (oriAA)
gi|14762|emb|X62058|AL3ORIAA [14762]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 13 nucleotide neighbors )
J02444

Bacteriophage alpha3 origin of DNA replication
gi|166103|gb|J02444|AL3ORI [166103]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors )
M25640

Bacteriophage alpha-3 H protein gene, complete cds
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors )
M10631

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
X00774

Bacteriophage alpha-3 gene J sequence
gi|15431|emb|X00774|NCBA3J [15431]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors )
M25640

Bacteriophage alpha-3 H protein gene, complete cds
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors )
M10631

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
Bacteriophage lambda, complete genome
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,87 MEDLINE links, 67 protein links, 190 nucleotide
neighbors, or 1 genome link )

J02482

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,23 MEDLINE links, 11 protein links, 26 nucleotide neighbors,
or 1 genome link )

J02454

Bacteriophage G4, complete genome
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE links, 11 protein links, 20 nucleotide neighbors,
or 1 genome link )

X60323

Bacteriophage phiK complete genome

gi|1478118|emb|X60323|BPHIKCG [1478118]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,10 protein links, 18 nucleotide neighbors, or 1 genome link )
L42820

Bacteriophage BF23 tail protein (hrs) gene, complete cds
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
X54455

Bacteriophage BF23 gene 17 and gene 18
gi|14797|emb|X54455|BF231718G [14797]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 2 nucleotide neighbors )
M37097

Bacteriophage BF23 DNA, right end of terminal repetition
gi|166115|gb|M37097|BBFRIGH [166115]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M37096

Bacteriophage BF23 DNA, left end of terminal repetition
gi|166114|gb|M37096|BBFLEFT [166114]
(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )

M37095

Bacteriophage BF23 A2-A3 gene, complete cds, and A1 gene, 5' end
gi|166110|gb|M37095|BBFA2A3 [166110]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor )
AF056281

Bacteriophage BF23 clone bf23.mac5/6.1, genomic survey sequence

gi|3090930|gb|AF056281|AF056281 [3090930]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056280

Bacteriophage BF23 clone bf23.mac3, genomic survey sequence

gi|3090929|gb|AF056280|AF056280 [3090929]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056279 

Bacteriophage BF23 clone bf23.mac18/21.34, genomic survey sequence
gi|3090928|gb|AF056279|AF056279 [3090928]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056278

Bacteriophage BF23 clone bf23.mac16/19.33, genomic survey sequence
gi|3090927|gb|AF056278|AF056278 [3090927]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056277

Bacteriophage BF23 clone bf23.mac16/19-33, genomic survey sequence
gi|3090926|gb|AF056277|AF056277 [3090926] 

(View GenBank report,FASTA report,ASN. 1 report, or Graphical view) 
AF056276 

Bacteriophage BF23 clone bf23.mac12/9-9, genomic survey sequence

gi|3090925|gb|AF056276|AF056276 [3090925]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056275

Bacteriophage BF23 clone bf23.mac11/14-24, genomic survey sequence

gi|3090924|gb|AF056275|AF056275 [3090924]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056274

Bacteriophage BF23 clone bf23.57r64r, genomic survey sequence
gi|3090923|gb|AF056274|AF056274 [3090923]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 3 nucleotide neighbors )
AF056273

Bacteriophage BF23 clone bf23.54fr, genomic survey sequence

gi|3090922|gb|AF056273|AF056273 [3090922]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056272

Bacteriophage BF23 clone bf23.47fr.mac10/7, genomic survey sequence

gi|3090921|gb|AF056272|AF056272 [3090921]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056271

Bacteriophage BF23 clone bf23.23.66r, genomic survey sequence

gi|3090920|gb|AF056271|AF056271 [3090920]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056270

Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence

gi|3090919|gb|AF056270|AF056270 [3090919]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056269

Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence
gi|3090918|gb|AF056269|AF056269 [3090918]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056268

Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence
gi|3090917|gb|AF056268|AF056268 [3090917]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 nucleotide neighbor )
AF056267

Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence
gi|3090916|gb|AF056267|AF056267 [3090916]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056266

Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence
gi|3090915|gb|AF056266|AF056266 [3090915]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056265

Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence
gi|3090914|gb|AF056265|AF056265 [3090914]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056264

Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence

gi|3090913|gb|AF056264|AF056264 [3090913]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056263

Bacteriophage BF23 clone bf23.23.68f35r, genomic survey sequence
gi|3090912|gb|AF056263|AF056263 [3090912]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056262

Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence
gi|3090911|gb|AF056262|AF056262 [3090911]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056261 

Bacteriophage BF23 clone bf23.23.2n\ genomic survey sequence 
gi|3090910|gb|AF056261|AF056261 [3090910] 

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056260

Bacteriophage BF23 clone bf23.23.55.f, genomic survey sequence
gi|3090909|gb|AF056260|AF056260 [3090909]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056259

Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence
gi|3090908|gb|AF056259|AF056259 [3090908]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056258

Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence
gi|3090907|gb|AF056258|AF056258 [3090907]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056257

Bacteriophage BF23 clone bf23.23.52.r, genomic survey sequence
gi|3090906|gb|AF056257|AF056257 [3090906]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056256

Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence
gi|3090905|gb|AF056256|AF056256 [3090905]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056255

Bacteriophage BF23 clone bf23.23.49.r, genomic survey sequence
gi|3090904|gb|AF056255|AF056255 [3090904]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056254 

Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence

gi|3090903|gb|AF056254|AF056254 [3090903] 

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF056253 

Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence
gi|3090902|gb|AF056253|AF056253 [3090902]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056252

Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence
gi|3090901|gb|AF056252|AF056252 [3090901]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056251

Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence
gi|3090900|gb|AF056251|AF056251 [3090900]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056250

Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence
gi|3090899|gb|AF056250|AF056250 [3090899]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056249

Bacteriophage BF23 clone bf23.23.22.a.r, genomic survey sequence
gi|3090898|gb|AF056249|AF056249 [3090898]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF056248

Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence
gi|3090897|gb|AF056248|AF056248 [3090897]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence
gi|3090896|gb|AF056247|AF056247 [3090896]
(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

Z50114

Bacteriophage BF23 DNA for putative tail protein gene
gi|2464952|emb|Z50114|BF23LATE [2464952]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 protein link )
D12824

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors )
Z34953

Bacteriophage K3 ip9, ip7 and ip8 genes
gi|535261|emb|Z34953|MYK3IP978 [535261]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
Z35075

Bacteriophage BC3 DNA for Ip3 and Ip4
gi|535229|emb|Z35075|MYEORF64K [535229]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
X05560

Bacteriophage K3 gene 38 for receptor recognizing protein
gi|15112|emb|X05560|MYK3G38 [15112]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
X04747

Bacteriophage K3 gene 37 for receptor recognizing protein
gi|15110|emb|X04747|MYK3G37 [15110]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
X01754

Bacteriophage K3 tail fiber gene 36
gi|15108|emb|X01754|MYK3F36 [15108]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
M16812

Bacteriophage K3 V lysis gene, complete cds
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
L46833

Bacteriophage K3 frd3, frd2 genes, complete cds
gi|951377|gb|L46833|PK3FRD32G [951377]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 2 nucleotide neighbors )
L43613

Bacteriophage K3 fibritin (wac) gene, complete cds
gi|903861|gb|L43613|PK3WAC [903861]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )
X01753

Bacteriophage Ox2 tail fiber gene 36
gi|15122|emb|X01753|MYOX2F36 [15122]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
L43612

Bacteriophage Ox2 fibritin (wac) gene, complete cds
gi|903848|gb|L43612|OX2WAC [903848]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )
Z46880
Bacteriophage OX2 stp gene
gi|599663|emb|Z46880|BPOX2STP [599663]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
X05675

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )

M33533

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
AF033329

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 11 nucleotide neighbors )
M86231

gi|215354|gb|M86231|P6962REGA [215354]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
AF033332

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 12 nucleotide neighbors )
U34036

Bacteriophage RB69 DNA polymerase (43) gene, complete cds
gi|1237125|gb|U34036|BRU34036 [1237125]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
V01145 

*l££t$Sl£5£g" Each ^ 8iven fa *" sequeace »«-— a 

gi|15557|emb|V01145|PODOHl [15557]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X05676 

Bacteriophage M1 gene 38 for receptor recognizing protein and flanking regions
gi|15114|emb|X05676|MYM1G38 [15114]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
AF034575

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF033321

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 17 nucleotide neighbors )
X55190

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
AF033334

Bacteriophage TuIb single-stranded binding protein (gene 32) gene, partial cds and 5' region
gi|2645798|gb|AF033334|AF033334 [2645798]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 5 nucleotide neighbors )

X55191 

a B „ a dS^uS 8eDe reCePt0r - reC ° ^ Pr0tei ° " 38 ■« fcr recepror-recognizing protein 38. 

gi|14863|emb|X55191|BPTUIB [14863] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors )
X13065

Bacteriophage phi80 early region
gi|14800|emb|X13065|BP80ER [14800]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors )
D00360
Bacteriophage phi80 cor gene
gi|217782|dbj|D00360|P8080COR [217782]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 protein link )
X01639

Bacteriophage phi 80 DNA-fragment with replication origin
gi|15828|emb|X01639|XXPHI80 [15828]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 25 nucleotide neighbors )
X04051

Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region)
gi|15770|emb|X04051|STPHI80X [15770]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
X06751

Phage Phi80 DNA for major coat protein
gi|15768|emb|X06751|STPHI80C [15768]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors )
X75949 

Bacteriophage phi80 DNA for ORF xI71.S and ORF x!71 28" 
gi|458811|emb|X75949|ECORF171B [458811]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors )
L40418

Bacteriophage phi-80 gene, complete cds
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
M24831

Bacteriophage phi-80 Tyr-tRNA gene, 3' end
gi|215363|gb|M24831|P80TGY [215363]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 43 nucleotide neighbors )
M10670

Bacteriophage phi-80 replication origin
gi|215361|gb|M10670|P80ORI [215361]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
M24825

Bacteriophage phi-80 RNA fragment
gi|215360|gb|M24825|P80M3A [215360]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
M11919

Bacteriophage phi-80 cI immunity region encoding the N gene
gi|215358|gb|M11919|P80a [215358]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
M10891

Bacteriophage phi-80 attP site DNA
gi|215357|gb|M10891|P80ATTP [215357]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
M19473

Bacteriophage 933J (from E.coli) proviral Shiga-like toxin type 1 subunits A and B genes, complete cds
gi|215072|gb|M19473|J93SLTZ [215072]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors )
Y10775

Bacteriophage 933W ileX, stx2A and stx2B genes
gi|1938206|emb|Y10775|BP933ILEX [1938206]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 36 nucleotide neighbors )
X83722

Bacteriophage 933W slt-IIB gene

gi|1490229|emb|X83722|B933WSLT [1490229]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 20 nucleotide neighbors )
X07865

Bacteriophage 933W slt-II gene for Shiga-like toxin type II subunit A and B
gi|14892|emb|X07865|BWSLTII [14892]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 29 nucleotide neighbors )
M16625

Bacteriophage H19B (from E.coli) sltIA and sltIB genes encoding Shiga-like toxin I subunits A and B, complete cds
gi|215043|gb|M16625|H19BSLT [215043]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors )
Bacteriophage H19B shiga-like toxin-1 (SLT-1) A and B subunit DNA, complete cds
gi|215046|gb|M17358|H19BSLTA [215046]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors )
U29728

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 1 protein link )
J02580

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes,
complete cds

gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors )
U32222

Bacteriophage 186, complete sequence
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors )
X51522

Bacteriophage P4 complete DNA genome
gi|450916|emb|X51522|MYP4CG [450916]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 13 protein links, 6 nucleotide neighbors,
or 1 genome link )

X92588

Bacteriophage 82 orf33, orf151, orf56, orf96, rus, orf45, and Q genes
gi|1051111|emb|X92588|BAC82HOLL [1051111]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,7 protein links, or 1 nucleotide neighbor )
J02803

Bacteriophage 82 antitermination protein (Q) gene, complete cds
gi|215364|gb|J02803|P82Q [215364]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
U02466

Bacteriophage HK022 (cro), (cII) and (O) genes, complete cds, (P) gene, partial cds
gi|407285|gb|U02466|BHU02466 [407285]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor )
M26291

Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds
gi|166194|gb|M26291|D18NER [166194]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
M11272

Bacteriophage D108 left-end DNA
gi|166193|gb|M11272|D18LEDNA [166193]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M18902

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
M10191

Bacteriophage D108, left end with Mu A protein binding sites L1 and L2
gi|166190|gb|M10191|D18BSL [166190]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
J02447

bacteriophage d108 gene a 5' end

gi|166189|gb|J02447|D18AAA [166189]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
V00865

Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A)
gi|15437|emb|V00865|NCD108 [15437]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
X01914

Bacteriophage IKe gene for DNA binding protein
gi|14957|emb|X01914|INIKEDBP [14957]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )

AF064539
Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 60 protein links, 26 nucleotide neighbors,
or 1 genome link )

U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,10 protein links, or 1 genome link )
AF007792

Bacteriophage Mu late morphogenetic region
gi|3551775|gb|AF007792|AF007792 [3551775]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 nucleotide neighbor )
U24159

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE links, 41 protein links, 8 nucleotide neighbors,
or 1 genome link )

Z71579

Bacteriophage S2 type A 5.6 kb DNA fragment
gi|1679806|emb|Z71579|BPHSlADNA [1679806]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors )
X53238

Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase
gi|14984|emb|X53238|KSK11RPO [14984]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
U29728

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 1 protein link )
J02445

bacteriophage bol 3 -terminal region rna
gi|166152|gb|J02445|BOITR3 [166152]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
L06183

Bacteriophage L5 (from Leuconostoc oenos) genome
gi|289353|gb|L06183|BL5GENM [289353]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 genome link )
AF074945

Mycoplasma arthritidis bacteriophage MAV1, complete genome
gi|3511243|gb|AF074945|AF074945 [3511243]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,15 protein links, 3 nucleotide neighbors, or 1 genome link )
L13696

Bacteriophage L2 (from Mycoplasma), complete genome
gi|289338|gb|L13696|BL2CG [289338]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 14 protein links, or 1 genome link )
X80191

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 1 genome link )
M19377

Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome
gi|215380|gb|M19377|PF3COMNY [215380]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors )
M11912

Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome
gi|215371|gb|M11912|PF3COMN [215371]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1
genome link )

V00605

Bacteriophage Pf1 gene encoding DNA binding protein
gi|14970|emb|V00605|INOPF1 [14970]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 1 nucleotide neighbor )
L05626

Bacteriophage PR4 capsid protein (P6) gene, complete cds
gi|215735|gb|L05626|PR4P6MAJA [215735]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
D13409

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors )

D13408

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL, ctx genes
gi|217775|dbj|D13408|BPHCOSLCTX [217775]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 3 nucleotide neighbors )
M24832

Bacteriophage f2 coat protein gene, partial cds
gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
S72011

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017628

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017626

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors )
AF017625

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017624

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors )
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Bacteriophage 2 1 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|26 1 8943|gb| AFO 1 762 1 |AP0 1 762 1 [2618943] 

(View GenBank re P ort,FASTA report,ASN.l report,Graphical view. 1 MEDLrNE link, 2 protein links, or 44 nucleotide neighbors ) 
D26449 

(View GenBank report.FASTA report^SN.l report,Graphical view, or 2 protein links ) 
X87627 

Bacteriophage D3 1 12 A and B genes 
gii974768lemb|X87627|BPD3112AB [974768] 

(View GenBank report.FASTA report.ASN.l report,Graphical view.l MEDLINElink, 2 protein links, or 1 nucleotide neighbor ) 
U32623 

Bacteriophage D3 transcriptional activator CXI (ell) gene, complete cds 
gi|984852|gb|U32623|BDU32623 [984852] 

(View GenBank repon,FASTA report, ASN.l report,Graphical view, 1 protein link, or 1 nucleotide neighbor ) 
L34781 

(View GenBank report,FASTA reporUSN. I report,Graphical view,l MEDLINE link, 4 protein links, or 2 nucleotide neighbors ) 
L14810 

Bacteriophage P22 (gplO) gene, complete cds, and (gp26) gene, complete cds 
gi|294053|gb|L148 10|P22GP1026X [294053] 

(View GenBank reportJASTA reporWSN.l report,Graphical view,! MEDLINE link, 2 protein links, or 2 nucleotide neighbors ) 
X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 9 nucleotide neighbors)
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds 
gi|1048680|gb|L42820|BBFHRS [1048680] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X14980 

Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme) 
gi|15802|emb|X14980|TEPRD1XV [15802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X06321 

Bacteriophage PRD1 gene 8 for DNA terminal protein
gi|15800|emb|X06321|TEPRD18 [15800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors)
X14336 

Filamentous Bacteriophage I2-2 genome
gi|14920|emb|X14336|INBI22 [14920]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 1 nucleotide neighbor, or 1 genome link)
L05001

Bacteriophage X glucosyl transferase gene, complete cds
gi|216044|gb|L05001|PXFGLUSYLT [216044]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M29479 

Bacteriophage P4 sid and psu genes, partial cds, and delta gene, complete cds
gi|215701|gb|M29479|PP4SDP [215701]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 4 nucleotide neighbors)

SEG_PP4PSUSID 
Bacteriophage P4 capsid size determination protein (sid) gene 5' end 
gi|215698|gb||SEG_PP4PSUSID [215698]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M29650 

Bacteriophage P4 polarity suppression protein (psu) gene, complete cds 
gi|215697|gb|M29650|PP4PSUSID2 [215697]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M29651

Bacteriophage P4 capsid size determination protein (sid) gene, 5' end
gi|215696|gb|M29651|PP4PSUSID1 [215696]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M27748 

Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene, 3' end
gi|215691|gb|M27748|PP4GOPBC [215691]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor)
K02750 

Bacteriophage IKe, complete genome
gi|215061|gb|K02750|IKECG [215061]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1 genome link)

L40418 

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF032122 

Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII)
genes, complete cds

gi|2465412|gb|AF021347|AF021347 [2465412]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
M35825 

Bacteriophage SF6 fragment D lysozyme gene, complete cds 
gi|216105|gb|M35825|SF6LYZ [216105]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

Z35479
Bacteriophage C16 ipl gene
gi|534936|emb|Z35479|BC16IPl [534936]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X12638

Bacteriophage 21 DNA for gene 2

gi|296141|emb|X12638|B21GENE2 [296141]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X02501 

Bacteriophage 21 DNA for left end sequence with genes 1 and 2
gi|15825|emb|X02501|XXPHA21 [15825]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds 
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M58702 

Bacteriophage 21 late gene regulatory region 
gi|215465|gb|M58702|PH2LATEGE [215465] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M81255

Bacteriophage 21 head gene operon 
gi|215454|gb|M81255|PH2HEADTL [215454] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors)
M23775

Bacteriophage 21 glycoprotein 1 gene, complete cds, and glycoprotein gene, 5' end
gi|215451|gb|M23775|PH2GPA [215451]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M61865 

Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds 
gi|215448|gb|M61865|PH22XISAA [215448] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 9 nucleotide neighbors)
S72011 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618964|gb|AF017628|AF017628 [2618964] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618961|gb|AF017627|AF017627 [2618961] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds 
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625

(v«» « M «p 0n , FASTA 1WASN . 1 rep( , n . Q „ plicll >itw , MDUNE link 2 proiito ^ mM 

AF017624

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
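AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017621

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)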

M57455 



(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
Y12633 

Bacteriophage 85 DNA, promoter sequence of unknown gene

gi|2058285|emb|Y12633|B85PROM [2058285]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP10P880P [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
Y07739 

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
L07580 

B B ^K7 580 ™ 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)



M34832 



Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds
gi|166157|gb|M34832|BPHINTXIS [166157]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M20394

Bacteriophage phi-11 S.aureus attachment site (attP)
gi|166156|gb|M20394|BPHATTP [166156]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
X82312

Bacteriophage phi-13 integrase gene
gi|758228|emb|X82312|PHI13INT [758228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
X61719

S.aureus phi-13 lysogen right chromosomal/bacteriophage DNA junction
gi|46625|emb|X61719|SAP13RJNC [46625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61718

S.aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction
gi|46624|emb|X61718|SAP13LJNC [46624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

X61717

Bacteriophage phi-13 core sequence for attachment
gi|14799|emb|X61717|BP13ATTP [14799]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
U01875

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, or 4 nucleotide neighbors)
X67739 

S.aureus Bacteriophage phi-42 attP gene 
gi|14809|emb|X67739|BPATTPA [14809] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
U01872 

Bacteriophage phi-42 integrase (int) gene, complete cds 
gi|437115|gb|U01872|U01872 [437115]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors)
X94423 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 1 nucleotide neighbor)
M27965 

Bacteriophage L54a (from S.aureus) int and xis genes, complete cds 
gi|215096|gb|M27965|L54INTXIS [215096]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
AB009866

Bacteriophage phi PVL proviral DNA, complete sequence
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)
Z47794

Bacteriophage Cp-1 DNA, complete genome
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)

SEG_CP7RSIT 

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166186|gb||SEG_CP7RSIT [166186]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11635

Bacteriophage Cp-7 (S.pneumoniae) DNA, 3' inverted terminal repeat
gi|166185|gb|M11635|CP7RSIT2 [166185]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11636

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat
gi|166184|gb|M11636|CP7RSIT1 [166184]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

SEG_CP5RSIT
Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166181|gb||SEG_CP5RSIT [166181]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11633

Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat
gi|166180|gb|M11633|CP5RSIT2 [166180]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11634

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166179|gb|M11634|CP5RSIT1 [166179]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M34780 

Bacteriophage Cp-9 muramidase (cpl9) gene 
gi|166187|gb|M34780|CP9CPL [166187] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M34652 

Bacteriophage HB-3 amidase (hbl) gene, complete cds 
gi|215055|gb|M34652|HB3HBLA [215055] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U64984 

Streptococcus pyogenes phage T12 repressor, excisionase (xis), integrase (int) and erythrogenic toxin A precursor (speA) genes,
complete cds
gi|1877426|gb|U40453|SPU40453 [1877426]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors)
X12375

Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site)
gi|15435|emb|X12375|NCCPPAC [15435]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF087814 

Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence 
gi|3702207|dbj|AB002632|AB002632 [3702207] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 1 genome link)
D83518 

Bacteriophage KVP40 gene for major capsid protein precursor, complete cds 
gi|3046858|dbj|D83518|D83518 [3046858]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033322 

Bacteriophage PST single-stranded binding protein (gene 32) gene, partial cds, and 5' region 
gi|2645774|gb|AF033322|AF033322 [2645774] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X94331 

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 protein links)
U82619

Shigella flexneri bacteriophage V glucosyl transferase (gtr), integrase (int) and excisionase (xis) genes, complete cds
gi|2465470|gb|U82619|SFU82619 [2465470]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor)



WO 00/32825 



PCT/IB99/02040 



Table 12

NCBI Entrez Nucleotide QUERY
Key words: bacteriophage and lysis
56 citations found (all selected) 



AJ011581

Bacteriophage PS119 lysis genes 13, 19, 15, and packaging gene 3,
complete cds

gi|3676084|emb|AJ011581|BPS011581 [3676084]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)

AJ011580 

Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and
packaging gene 3, complete cds
gi|3676078|emb|AJ011580|BPS011580 [3676078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 2 nucleotide neighbors)



AJ011579 

Bacteriophage PS3 lysis genes 13, 19, 15, and packaging gene 3 
gi|3676073|emb|AJ011579|BPS011579 [3676073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)



AF034975

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)



U37314 

Bacteriophage lambda Rz1 protein precursor (Rz1) gene, complete cds
gi|1017780|gb|U37314|BLU37314 [1017780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 1 protein link, or 9 nucleotide neighbors)



U00005 

E.coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene,
complete cds; miaA gene, partial cds
gi|436153|gb|U00005|ECOHFLA [436153]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 5 protein links, or 8 nucleotide neighbors)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AF064539 

Bacteriophage N15, complete genome 
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)



AF063097 

Bacteriophage P2, complete genome 
gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)



AF059243 

Bacteriophage NL95, complete genome 
gi|3088545|gb|AF059243|AF059243 [3088545]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, 3 nucleotide neighbors, or 1 genome link)



AF052431 

Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase
genes, complete cds
gi|2981208|gb|AF052431| [2981208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, or 8 nucleotide neighbors)



Y07739 

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)



X94331

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 4 protein links)



X78410 

Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



X99260 

Bacteriophage B103 genomic sequence
gi|1429229|emb|X99260|BB103G [1429229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 17 protein links, or 12 nucleotide neighbors)



AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 10 protein links, or 31 nucleotide neighbors)



X87420

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 9 nucleotide neighbors)



L35561 

Bacteriophage phi-105 ORFs 1-3 
gi|532218|gb|L35561|PH50RFHrR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)



D10027 

Group II RNA coliphage GA genome 
gi|217784|dbj|D10027|PGAXX [217784]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, 5 nucleotide neighbors, or 1 genome link)



V01128

Bacteriophage phi-X174 (cs70 mutation) complete genome
gi|15535|emb|V01128|PHIX174 [15535]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 11 protein links, or 26 nucleotide neighbors)
S81763

coat gene...replicase gene [bacteriophage KU1, host=Escherichia coli,
group D RNA phage. Genomic RNA, 3 genes, 120 nt]
gi|1438766|gb|S81763|S81763 [1438766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1
MEDLINE link)



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region 
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)



V00642 

phage MS2 genome 
gi|15081|emb|V00642|LEMS2X [15081]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, or 20 nucleotide neighbors)



V01146

Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE
links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)



X78401 

Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin
region genes, ninG phosphatase, late control gene 23, orf 60, complete
cds, late control region, start of lysis gene 13
gi|512343|emb|X78401|POP22NIN [512343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 13 protein links, or 4 nucleotide neighbors)



Y00408 

Bacteriophage T4 gene t for lysis protein 
gi|15368|emb|Y00408|MYT4T [15368]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)



Z26590

Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)



X07809 

Phage phiX174 lysis (E) gene upstream region
gi|15094|emb|X07809|MIPHIXE [15094]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)



Z34528 

Lactococcal bacteriophage c2 lysin gene 
gi|506455|emb|Z34528|LBC2LYSIN [506455]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



X15031 

Bacteriophage fr RNA genome 
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)



X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase 
proteins 

gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 1 genome link)



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



X85009 

Bacteriophage A500 hol500 and ply500 genes 
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)



X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

Z35638

Bacteriophage phi-X174 genes for lysis protein and beta-lactamase
gi|520996|emb|Z35638|BPLYSPR [520996]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 516 nucleotide neighbors)



J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)



X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors)



X87673 

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 1 nucleotide neighbor)



M14784 

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis 
protein and DNA packaging proteins, complete cds 
gi|215810|gb|M14784|PT3RE [215810]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 9 protein links, or 10 nucleotide neighbors)



M11813

Bacteriophage PZA (from B.subtilis), complete genome
gi|216046|gb|M11813|PZACG [216046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 27 protein links, 17 nucleotide neighbors, or 1 genome link)



M16812 

Bacteriophage K3 V lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



J04356 

Bacteriophage P22 proteins 15 (complete cds), and 19 (3' end) genes 
gi|215265|gb|J04356|P2215P [215265]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)

J04343 

Bacteriophage JP34 coat and lysis protein genes, complete cds, and 
replicase protein gene, 5' end 
gi|215076|gb|J04343|JP3COLY [215076]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)

J02482

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE
links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

M99441 

Bacteriophage T4 anti-sigma 70 protein (asiA) gene, complete cds and
lysis protein, 3' end

gi|215820|gb|M99441|PT4ASIA [215820]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 2 protein links, or 2 nucleotide neighbors)

M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

M10637 

Phage G4 D/E overlapping gene system, encoding D (morphogenetic) and E
(lysis) proteins

gi|215427|gb|M10637|PG4DE [215427]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 12 nucleotide neighbors)

J02454 

Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)

J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2,
outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 4 nucleotide neighbors)

M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein,
pre-neck appendage protein, morphogenesis(13), lysis, morphogenesis(15),
encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 11 protein links, or 11 nucleotide neighbors)



M10997

Bacteriophage P22 lysis genes 13 and 19, complete cds
gi|215262|gb|M10997|P221319 [215262]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 3 nucleotide neighbors)



J02467 

Bacteriophage MS2, complete genome 
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)



M14035

Bacteriophage lambda lysis S gene with mutations leading to nonlethality
of S in the plasmid pRG1
gi|215180|gb|M14035|LAMLYS [215180]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 14 nucleotide neighbors)



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein
hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



Table 13



NCBI Entrez Nucleotide QUERY 

Key word: holin 

51 citations found (all selected) 



AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil 
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

U52961 

Staphylococcus aureus holin-like protein LrgA (lrgA) and LrgB (lrgB)
genes, complete cds

gi|1841516|gb|U52961|SAU52961 [1841516]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)

U28154 

Haemophilus somnus cryptic prophage genes, capsid scaffolding protein 
gene, partial cds, major capsid protein precursor, endonuclease, capsid 
completion protein, tail synthesis proteins, holin, and lysozyme genes, 
complete cds 

gi|1765928|gb|U28154|HSU28154 [1765928]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 13 protein links)

AF032122 

Streptococcus thermophilus bacteriophage Sfil9 central region of genome 
gi|2935682|gb|AF032122| [2935682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)

AF032121 

Streptococcus thermophilus bacteriophage Sfi21 central region of genome 
gi|2935667|gb|AF032121|AF032121 [2935667]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)

AF021803

Bacillus subtilis 168 prophage SPbeta N-acetylmuramoyl-L-alanine amidase
(blyA), holin-like protein (bhlA), holin-like protein (bhlB), and yolK
genes, complete cds; and yolJ gene, partial cds
gi|2997594|gb|AF021803|AF021803 [2997594]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 5 protein links, or 1 nucleotide neighbor)



AF057033 

Streptococcus thermophilus bacteriophage Sfi11 gp502 (orf502), gp284
(orf284), gp129 (orf129), gp193 (orf193), gp119 (orf119), gp348
(orf348), gp53 (orf53), gp113 (orf113), gp104 (orf104), gp114 (orf114),
gp128 (orf128), gp168 (orf168), gp117 (orf117), gp105 (orf105), putative
minor tail protein (orf1510), putative minor structural protein
(orf512), putative minor structural protein (orf1000), gp373 (orf373),
gp57 (orf57), putative anti-receptor (orf695), putative minor structural
protein (orf669), gp149 (orf149), putative holin (orf141), putative
holin (orf87), and lysin (orf288) genes, complete cds
gi|3320432|gb|AF057033|AF057033 [3320432]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 25 protein
links, or 1 nucleotide neighbor)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,63 protein 
links, or 1 nucleotide neighbor ) 



AF009630 

Bacteriophage bIL170, complete genome 
gi|3282260|gb|AF009630|AF009630 [3282260] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,63 protein 
links, 3 nucleotide neighbors, or 1 genome link ) 



AF064539 

Bacteriophage N15, complete genome 



gi|3192683|gb|AF064539|AF064539 [3192683] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link ) 



AF063097 

Bacteriophage P2, complete genome 

gi|3139086|gb|AF063097|AF063097 [3139086] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,21 MEDLINE 
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link ) 



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes 
gi|2707950|emb|Z97974|BPHIADH [2707950] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 9 protein links, or 1 nucleotide neighbor ) 



X95646 

Streptococcus thermophilus bacteriophage Sfi21 DNA; lysogeny module, 
8141 bp 

gi|2292747|emb|X95646|BSFI21LYS [2292747] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 19 protein links, or 3 nucleotide neighbors ) 



SEG_LLHLYSIN0 

Bacteriophage LL-H structural protein gene, partial cds; minor 
structural protein gp61 (g57), unknown protein, unknown protein, 
structural protein (g20), unknown protein, unknown protein, major capsid 
protein (g34), main tail protein gp19 (g17), holin (hol), muramidase 
(mur), unknown protein, unknown protein, unknown protein, unknown 
protein, unknown protein, and unknown protein genes, complete cds; 
unknown protein gene, partial cds; and unknown protein, unknown protein, 
unknown protein, unknown protein, unknown protein, minor structural 
protein gp75 (g70), minor structural protein gp89 (g88), minor 
structural protein gp58 (g71), unknown protein, unknown protein, unknown 
protein, and unknown protein genes, complete cds 
gi|1004337|gb||SEG_LLHLYSIN0 [1004337] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 MEDLINE 
links, 31 protein links, or 1 nucleotide neighbor ) 



M96254 

Bacteriophage LL-H holin (hol), muramidase (mur), and unknown protein 
genes, complete cds 

gi|1004336|gb|M96254|LLHLYSIN03 [1004336] 

(View GenBank report,FASTA report,ASN.1 report, or Graphical view) 



Y07740 

Staphylococcus phage 187 ply187 and hol187 genes 
gi|2764982|emb|Y07740|BP187PLYH [2764982] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 
protein links ) 



U88974 

Streptococcus thermophilus bacteriophage O1205 DNA sequence 
gi|2444080|gb|U88974| [2444080] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 57 protein links, or 6 nucleotide neighbors ) 



Z99117 

Bacillus subtilis complete genome (section 14 of 21): from 2599451 to 
2812870 

gi|2634966|emb|Z99117|BSUB0014 [2634966] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,233 
protein links, 51 nucleotide neighbors, or 1 genome link ) 



Z99115 

Bacillus subtilis complete genome (section 12 of 21): from 2195541 to 
2409220 

gi|2634478|emb|Z99115|BSUB0012 [2634478] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,244 
protein links, 64 nucleotide neighbors, or 1 genome link ) 



Z99110 

Bacillus subtilis complete genome (section 7 of 21): from 1194391 to 
1411140 

gi|2633472|emb|Z99110|BSUB0007 [2633472] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,226 
protein links, 31 nucleotide neighbors, or 1 genome link ) 



X78410 

Bacteriophage phiadh holin and lysin genes 
gi|793848|emb|X78410|LGHOLLYS [793848] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 



Z93946 



Bacteriophage Dp-1 dph and pal genes and 5 open reading frames 
gi|1934760|emb|Z93946|BPDP1ORFS [1934760] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 6 
protein links ) 



AF011378 

Bacteriophage sk1 complete genome 
gi|2392824|gb|AF011378|AF011378 [2392824] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,54 protein 
links, 2 nucleotide neighbors, or 1 genome link ) 



Z47794 

Bacteriophage Cp-1 DNA, complete genome 
gi|2288892|emb|Z47794|BPCP1XX [2288892] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE 
links, 28 protein links, 1 nucleotide neighbor, or 1 genome link ) 



L35561 

Bacteriophage phi-105 ORFs 1-3 
gi|532218|gb|L35561|PH5ORFHTR [532218] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, or 3 protein links ) 



D49712 

Bacillus licheniformis DNA for ORFs, xpaL2 homologous protein and xpaL1 
homologous protein, complete and partial cds 
gi|1514423|dbj|D49712|D49712 [1514423] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, or 4 protein links ) 



X90511 

Lactobacillus bacteriophage phig1e DNA for Rorf162, Holin, Lysin, and 
Rorf175 genes 

gi|1926386|emb|X90511|LBPHIHOL [1926386] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 protein 
links, or 1 nucleotide neighbor ) 



X98106 

Lactobacillus bacteriophage phig1e complete genomic DNA 
gi|1926320|emb|X98106|LBPHIG1E [1926320] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 50 protein links, or 4 nucleotide neighbors ) 



U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein 
links, or 2 nucleotide neighbors ) 



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and 
lysin genes, complete cds 
gi|1353517|gb|U38906|BRU38906 [1353517] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 50 protein links, or 3 nucleotide neighbors ) 



X91149 

Bacteriophage phi-C31 DNA cos region 
gi|1107473|emb|X91149|APHIC31C [1107473] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 6 protein links, or 1 nucleotide neighbor ) 



U24159 

Bacteriophage HP1 strain HP1c1, complete genome 
gi|1046235|gb|U24159|BHU24159 [1046235] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,6 MEDLINE 
links, 41 protein links, 8 nucleotide neighbors, or 1 genome link ) 



Z26590 

Bacteriophage mv4 lysA and lysB genes 
gi|410500|emb|Z26590|MV4LYSAB [410500] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 4 
protein links ) 



Z70177 

B.subtilis DNA (28 kb PBSX/skin element region) 
gi|1225934|emb|Z70177|BSPBSXSE [1225934] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,32 protein 
links, or 4 nucleotide neighbors ) 



Z36941 



B.subtilis defective prophage PBSX xhlA, xhlB, and xylA genes 
gi|535793|emb|Z36941|BSPBSXXHL [535793] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 protein 
links, or 5 nucleotide neighbors ) 



X89234 

L.innocua DNA for phagelysin and holin gene 
gi|1134844|emb|X89234|LICPLYHOL [1134844] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 4 nucleotide neighbors ) 



X85010 

Bacteriophage A511 ply511 gene 
gi|853748|emb|X85010|BPA511PLY [853748] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 3 protein links, or 1 nucleotide neighbor ) 



X85009 

Bacteriophage A500 hol500 and ply500 genes 
gi|853744|emb|X85009|BPA500PLY [853744] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 3 protein links, or 4 nucleotide neighbors ) 



X85008 

Bacteriophage A118 hol118 and ply118 genes 
gi|853740|emb|X85008|BPA118PLY [853740] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 3 protein links, or 1 nucleotide neighbor ) 



L34781 

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and 
peptidoglycan hydrolase (lytA) gene, partial cds 
gi|511838|gb|L34781|BPHHOLIN [511838] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 4 protein links, or 2 nucleotide neighbors ) 

U11698 

Serratia marcescens SM6 extracellular secretory protein (nucE), putative 
phage lysozyme (nucD), and transcriptional activator (nucC) genes, 
complete cds 

gi|509550|gb|U11698|SMU11698 [509550] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 3 protein links, or 1 nucleotide neighbor ) 

U31763 

Serratia marcescens phage-holin analog protein (regA), putative phage 
lysozyme (regB), and transcriptional activator (regC) genes, complete 
cds 
gi|965068|gb|U31763|SMU31763 [965068] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 3 protein links, or 1 nucleotide neighbor ) 

X87674 

Bacteriophage P1 lydA & lydB genes 
gi|974763|emb|X87674|BACP1LYD [974763] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 2 nucleotide neighbors ) 

L48605 

Bacteriophage c2 complete genome 

gi|1146276|gb|L48605|C2PVCG [1146276] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE 
links, 39 protein links, 3 nucleotide neighbors, or 1 genome link ) 

L33769 

Bacteriophage bIL67 DNA polymerase subunit (ORF3-5), essential 
recombination protein (ORF13), lysin (ORF24), minor tail protein 
(ORF31), terminase subunit (ORF32), holin (ORF37), unknown protein (ORF 
1-2,6-12,14-23,25-30,33-36), complete genome 
gi|522252|gb|L33769|L67CG [522252] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 37 protein links, 2 nucleotide neighbors, or 1 genome link ) 

L31348 

Bacteriophage Tuc2009 integrase (int) gene, complete cds; lysin (lys) 
gene, 3' end 

gi|508612|gb|L31348|TU2INT [508612] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 3 protein links, or 3 nucleotide neighbors ) 

L31364 

Bacteriophage Tuc2009 holin (S) gene, complete cds; lysin (lys) gene, 
complete cds 

gi|496281|gb|L31364|TU2SLYS [496281] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 

L31366 

Bacteriophage Tuc2009 structural protein (mp2) gene, complete cds 
gi|496278|gb|L31366|TU2MP2A [496278] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 

L31365 

Bacteriophage Tuc2009 structural protein (mp1) gene, complete cds 
gi|496276|gb|L31365|TU2MP1A [496276] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, or 1 protein link ) 



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein 
hydrolase (lysB) gene, complete cds 
gi|530796|gb|U04309|BPU04309 [530796] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 2 protein links, or 1 nucleotide neighbor ) 
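
Each record in the listing above carries an identifier line in the NCBI pipe-delimited form gi|GI number|database|accession|locus, followed by the GI number in square brackets. As a minimal sketch of how such a line can be split into its fields, the following may help; the names SeqId and parse_gi_line are illustrative assumptions and are not part of any NCBI tool.

```python
from typing import NamedTuple, Optional


class SeqId(NamedTuple):
    """Fields of one pipe-delimited identifier line (hypothetical helper type)."""
    gi: int
    database: str            # e.g. "gb" (GenBank), "emb" (EMBL), "dbj" (DDBJ)
    accession: str
    locus: Optional[str]     # None when the locus field is empty


def parse_gi_line(line: str) -> SeqId:
    """Parse 'gi|<number>|<db>|<accession>|<locus> [<number>]'."""
    # Drop the trailing bracketed GI number, then split on the pipe delimiter.
    identifier = line.split("[", 1)[0].strip()
    fields = identifier.split("|")
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError(f"not a gi identifier line: {line!r}")
    locus = fields[4] if len(fields) > 4 and fields[4] else None
    return SeqId(int(fields[1]), fields[2], fields[3], locus)


record = parse_gi_line("gi|530796|gb|U04309|BPU04309 [530796]")
print(record.accession, record.locus)
```

Lines whose locus field is empty, such as the Sfi19 entry, parse with locus set to None.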



Table 14 



NCBI Entrez Nucleotide QUERY 
Key word: bacteriophage and kil 
5 citations found (all selected) 
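
A keyword query of this kind can be expressed as a request URL. The sketch below assumes NCBI's present-day E-utilities ESearch endpoint, which is not what the original search used (it was run through the Entrez interface); the function name build_esearch_url and the retmax default are illustrative. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: the current NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keywords, db="nuccore", retmax=20):
    """Join the keywords with AND and encode them as an ESearch query URL."""
    term = " AND ".join(keywords)
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{ESEARCH}?{query}"


# The key words of this table: bacteriophage and kil
url = build_esearch_url(["bacteriophage", "kil"])
print(url)
```

A present-day search would not be expected to return the same five citations, since the database has grown since the query was run.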

AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil 
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N 
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O 
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q), 
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin 
protein (S) genes, complete cds 
gi|2668751|gb|AF034975| [2668751] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 20 protein links, or 30 nucleotide neighbors ) 

X15637 

Bacteriophage P22 P(L) operon encompassing ral, 17, kil and arf genes 
gi|15646|emb|X15637|POP22PL [15646] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 7 protein links, or 2 nucleotide neighbors ) 

J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,87 MEDLINE 
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link ) 

M64097 

Bacteriophage Mu left end 
gi|215543|gb|M64097|PMULEFTEN [215543] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE 
links, 39 protein links, or 15 nucleotide neighbors ) 

M18902 

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and 
containing three ORFs, complete cds 
gi|166191|gb|M18902|D18KIL [166191] 

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE 
link, 1 protein link, or 3 nucleotide neighbors ) 



Table 15 



U77328 


V01282 


U11787 


U93688 


A47599 


D21131 


U76864 


U38428 


AF151117 


AF121672 


U11786 


U93687 


A47598 


D30690 


U76863 


U66665 


AF151218 


AF072726 


U11785 


AJ224764 


A47597 


D14711 


U76862 


U66664 


AF146368 


AF115379 


U11784 


AF064774 


A47596 


D90119 


U76861 


U66663 


AF144661 


AF034153 


U11783 


AF064773 


A47595 


D00730 


U76860 


X87104 


AF132117 


AF029244 


U11782 


Y14370 


A47594 


D83357 


U76859 


X87105 


Y15477 


U67965 


U11781 


AF065394 


A44534 


D83356 


U76858 


X89233 


Y09928 


U96610 


U11780 


AF062376 


A44533 


D83355 


U76857 


M28521 


Y09594 


U96609 


U11779 


AF062375 


A44529 


D83354 


U76855 


U54636 


AF134905 


U73027 


U11778 


AF062374 


A44528 


D83353 


U76854 


U46541 


AB019536 


U73026 


U11777 


AF062373 


A44527 


D12572 


U76853 


L14017 


AJ237696 


U73025 


U11776 


AB007500 


A44526 


D86727 


U76852 


U60589 


AF106851 


AF068904 


U11775 


Y09924 


A44525 


D86240 


U76851 


X48003 


AF106850 


U60050 


U11774 


U63529 


A39696 


D67075 


U76850 


M37889 


AF106849 


D10907 


U11773 


AF033191 


AF001783 


D67074 


U76849 


V01281 


M26321 


D10906 


AF053772 


Y15856 


AF001782 


U97062 


U76848 


X97985 


AF060191 


AF053140 


AF053771 


AB000439 


L77194 


U96620 


U76847 


X00127 


AF060190 


AB013298 


AF029731 


AF041467 


AF003593 


U96619 


Y09929 


X03286 


AF060189 


Y16431 


AF027155 


Y14051 


AF003592 


Z84573 


Y09570 


X62282 


AF060188 


AF076684 


AF024571 


U82085 


X73889 


AB001896 


X95848 


X01645 


AF060187 


AF076683 


U87144 


AF026122 


X74219 


Y07645 


Y09428 


X16471 


AF060186 


Y13225 


AF086644 


AF026121 


Y10419 


U92441 


S76611 


X52734 


AF060185 


AF094826 


AJ223781 


AF026120 


M63177 


U91741 


S76213 


X13290 


AF060184 


AJ223480 


AF076030 


AB009635 


E08773 


U29454 


S75707 


X66088 


AF036324 


AF093548 


AF044951 


AB006796 


E07163 


U29478 


S75706 


Z30588 


AF036323 


AJ005352 


AF044906 


U39769 


E07162 


U77374 


S75705 


X16457 


AF053568 


AF051916 


AF044905 


D00184 


E07161 


L42945 


S76270 


X00342 


AJ132841 


Y09927 


AF044904 


X56628 


E07160 


U38429 


S72497 


V01287 


Y13766 


AF051917 


AF044903 


AF033018 


E07159 


U81980 


S72488 


X61307 


AF101234 


S77058 


AF044902 


AF034076 


E07158 


X55185 


S74031 


Y00356 


AJ133520 


S65052 


AF044901 


D82063 


E07157 


V01278 


S67449 


X06603 


AJ133495 


AF009671 


AF044900 


D76414 


E07156 


U31979 


U75367 


Z93205 


AJ132803 


U81973 


AF044899 


U57060 


E07155 


X91786 


U75368 


X64172 


AB016487 


U77308 


AF044898 


D89066 


E03836 


U36912 


U31175 


X72700 


AB016431 


U20869 


AF044897 


U85095 


E03835 


U36911 


X53096 


X60827 


AB015981 


U89396 


AF044075 


U85097 


E03526 


U36910 


X53951 


X64389 


AB015195 


U94706 


AF044074 


U85096 


E02873 


U64885 


X53952 


X62288 


AF107307 


U41072 


AF044073 


D42078 


E01690 


U76872 


X03408 


X55798 


AF079518 


U52961 


AF044072 


AF015929 


E00876 


U76871 


U50629 


X58434 


AJ223806 


U21636 


AF044071 


D10369 


E00203 


U76870 


U38656 


X06627 


Y18018 


U65000 


AF044070 


A48955 


D83951 


U76869 


U58139 


X12831 


Y17795 


U48826 


AF044069 


A48501 


D17366 


U76868 


A31894 


X07371 


AJ005647 


U20503 


AF044068 


A48500 


D42144 


U76867 


L42943 


X02529 


AJ005646 


U11789 


AF044067 


A48499 


D42143 


U76866 


U51474 


Y00688 


AJ005645 


U11788 


AF044066 


A47600 


D10489 


U76865 


U50077 


X04121 


X59477 


X54338 


A12915 


U51133 


M63176 


M10500 


L01055 


M63917 


X59478 


X51661 


A12913 


U51132 


L11998 


M10499 


M83994 


M58515 


X63598 


X05815 


A12906 


X02588 


L05004 


AH000934 


J03947 


L10909 


X52593 


X15574 


A12905 


X61716 


L42764 


M10498 


J03479 


M15067 


X76490 


Y07536 


A12904 


X61719 


M32103 


M10497 


M64724 


M92376 


X81586 


X02166 


A12903 


X61718 


U10927 


M18264 


M14372 


M62650 


X72014 


Z49245 


A12902 


X67743 


AH003057 


J01786 


M14371 


M32312 


X72013 


X16298 


A12901 


X67742 


M73535 


M33833 


M14374 


M20393 


X71437 


Z18852 


A12900 


X67741 


M73536 


M32470 


M15215 


M90536 


X62992 


X68417 


A12899 


X67740 


U20782 


M20270 


M36694 


M21854 


X52594 


X68425 


A12898 


X67738 


L37598 


J03323 


M37915 


M36771 


X14827 


X17679 


A12897 


U02910 


L37597 


M33479 


M12715 


L14020 


X13404 


X63072 


A12896 


AH003349 


L36472 


M94061 


J04151 


M81736 


X17301 


X02872 


A09523 


M11118 


L25288 


M37888 


L22566 


U11702 


X17688 


V01277 


A04518 


M18086 


L25893 


M76714 


L13379 


L19300 


X03097 


X52543 


A04517 


U19459 


K02687 


M17123 


L13378 


L25372 


Z16422 


A19943 


A04512 


U35773 


L23109 



M97169 


L13377 


L22565 


Z33409 


A19942 


L41499 


U26702 


L07778 


M81346 


L13376 


M58516 


Z33408 


A19941 


U19770 


U21221 


M90056 


M90693 


L13375 


U06462 


Z33407 


A19940 


X53818 


U36379 


J02615 


M25257 


L13374 


L19298 


Z33406 


A19939 


M20129 


U06451 


M18970 


M25256 


M17348 


M80252 


Z33405 


A19938 


L43098 


U35036 


K02985 


M25255 


M17357 


L11530 


Z33404 


A19937 


L43082 


U20794 


M21136 


M25254 


M17347 




X75439 


A19936 


X03216 


L25426 


M10501 


M25253 


M28364 




X62587 


A17958 


X70648 


M86227 


AH000935 


M25252 


M21319 



Table 16 

Phage 44AHJD complete genome sequence. 16668 nucleotides. 

1 tccatttctt tactaaactt aaaaatgctg tgcaacaact taaccaactt atctaaccta ttacatattc 

71 atcaaataca aaatttatgt atctattgac ttttattcaa aattatgatt tcaacatata ataaaattaa 

141 tttacttatt taaatattct atgatataat tagttataaa atatttggag gtgtataaat gacagaattt 

211 gatgaaatcg taaaaccaga cgacaaagaa gaaacttcag aatcaactga agaaaattta gaatcaactg 

281 aagaaacttc agaatcaact gaagaatcaa ctgaagaatc aactgaagaa tcaactgaag ataaaacagt 

351 agaaacaatc gaagaagaaa atgaaaacaa attagaacct actacaacag atgaagatag ttcgaaattt 

421 gaccctgttg tattagaaca acgtattgct tcattagaac aacaagtgac tactttttta tcttcacaaa 

491 tgcaacaacc acaacaagta caacaaacac aatcagatgt aacagaatca aacaaagaag ataacgacta 

561 ttcagatgaa gaactagttg ataagttaga tttagattag gaggaattta aacatgtatg agggaaacaa 

631 catgcgttct atgatgggta catcatatga agattcaaga ttaaataaac gaacagaatt aaatgaaaac 

701 atgtcaattg atacaaataa aagtgaagat agttatggtg tacaaattca ttcactttca aaacaatcat 

771 ttacaggtga cgttgaggag gaataataaa ttatggcaca acaatctaca aaaaatgaaa ctgcactttt 

841 agtagcaaag tcagctaaat cagcgttaca agattttaat catgattatt caaaatcttg gacatttggc 

911 gacaaatggg ataattcaaa tacaatgttc gaaacatttg taaataaata tttattccct aagattaatg 

981 agactttatt aatcgatatt gcattaggta atcgttttaa ttggttagct aaagagcaag attttattgg 

1051 acaatatagt gaagaatacg tgattatgga cacagtacca attaacatgg acttatctaa aaatgaggaa 

1121 ttaatgttga aacgtaatta tccacgtatg gcaactaagt tatatggtaa cggaattgtg aagaaacaaa 

1191 aattcacatt aaacaacaat gatacacgtt tcaatttcca aacattagca gacgcaacta attacgcttt 

1261 aggtgtatac aaaaagaaaa tttctgatat taatgtatta gaagaaaaag aaatgcgtgc aatgttagtt 

1331 gattactcat tgaatcaatt atccgaaaca aatgtacgta aagcaacatc aaaagaagat ttagcaagca 

1401 aagtttttga agcaatccta aacttacaaa acaacagtgc taaatataat gaagtacatc gtgcatcagg 

1471 tggtgcaatt ggacaatata caactgtatc aaaattaaaa gatattgtga ttttaacaac agattcatta 

1541 aaatcttatc ttttagatac taagattgca aacacattcc agattgcagg cattgatttc acagatcacg 

1611 ttattagttt tgacgactta ggtggcgtgt ttaaagtaac aaaagaattt aagttacaaa accaagattc 

1681 aattgacttt ttacgtgcgt atggagatta tcaatcacaa ttaggagata caattccagt tggtgctgta 

1751 tttacttatg atgtatctaa acttaaagag tttactggca acgttgaaga aattaaacca aaatcagatt 

1821 tatatgcgtt tattttggat attaattcaa ttaaatataa acgttacaca aaaggtatgt taaaaccacc 

1891 attccataac cctgaatttg atgaagttac acactggatt cattactatt catttaaagc cattagtcca 

1961 ttctttaata aaattttaat tactgaccaa gatgtaaatc caaaaccaga ggaagaatta caagaataaa 

2031 aggagcgtaa aatatgaaca acgataaaag aggtttaaac gttgagttat caaaggaaat cagcaaaaga 

2101 gttgttgaac atcgcaacag atttaaacgt cttatgttta atcgttattt ggaattttta ccgctactaa 

2171 tcaactatac caatcgtgat acggttggta tagattttat tcagttagaa tcagctttaa gacaaaacat 

2241 taatgtagtt gttggtgaag ctagaaataa gcaaattatg attcttggtt atgtaaataa cacttacttt 

2311 aatcaagcac caaatttttc atcaaacttt aatttccaat ttcaaaaacg attaactaaa gaagatatat 

2381 attttattgt acctgactat ttaatacctg atgattgtct acaaattcat aagctatatg ataactgtat 

2451 gagtggtaac tttgttgtca tgcaaaataa accaattcaa tataatagtg atatagaaat tatagaacat 

2521 tatactgatg aattagcaga agttgcttta tctcgctttt ctttaatcat gcaagcaaaa tttagcaaga 

2591 tatttaaatc agaaattaat gacgagtcaa tcaatcaact tgtgtccgaa atatataacg gtgcaccatt 

2661 tgttaaaatg tcacctatgt ttaatgcaga tgacgatatc attgatttaa caagtaatag cgtaatccca 

2731 gcattaactg aaatgaaacg ggaatatcaa aacaaaatta gtgaattaag taactattta ggcattaatt 

2801 cattagccgt tgataaagaa agcggtgttt cagacgaaga ggcaaaaagt aatcgtggat ttaccacatc 

2871 aaacagtaat atctatttaa aaggtcgtga accaattacg tttttatcaa agcgttatgg tttagatatt 

2941 aaaccgtatt acgatgatga aacaacgtct aaaatatcaa tggtagacac actttttaaa gacgaaagca 

3011 gtgatataaa tggctagata cacaatgact ttatacgatt tcattaaatc agaattgatt aaaaaaggtt 

3081 tcaatgaatt tgtaaatgat aataaattaa cgttttatga tgatgaattt caattcatgc aaaaaatgct 

3151 gaagttcgac aaagacgttt tagctatcgt taatgaaaaa gtatttaaag gtttttcatt gaaagatgaa 

3221 ttatcagatt tactttttaa aaaatcattt acgattcatt ttttagatag agaaatcaac agacaaacag 

3291 ttgaagcatt tggcatgcaa gtgattactg tatgtattac acatgaggat tatttaaatg tggtttattc 

3361 atcaagtgaa gttgaaaaat acttacaatc acaaggcttc acagaacaca atgaagatac aacaagtaac 

3431 actgatgaaa catcgaatca aaatgctaca tctttagaca attcaactgg catgactgca aacagaaacg 

3501 cttatgtgtc attaccacaa agtgaggtta acattgatgt tgataataca acgttacgat tcgctgataa 

3571 taatacgatt gataacggta aaactgtgaa taaatcgagt aacgaaagta atcaaaacgc aaaacgtaat 

3641 caaaatcaaa aaggtaatgc aaaaggtaca caattcacca agcagtattt aattgataat attgataaag 

3711 cgtacgattt aagaaagaaa attttaaatg aatttgataa aaaatgtttt ttacaaattt ggtagaggtg 

3781 gttaaataat ggcatataat gaaaacgatt ttaaatattt tgatgacatt cgtccatttt tagacgaaat 

3851 ttataaaacg agagaacgtt atacaccgtt ttacgatgat agagcagatt ataatactaa ttcaaaatca 

3921 tattatgatt atatttcaag attatcaaaa ctaattgaag tattagcacg tcgtatttgg gactatgaca 

3991 atgaattaaa aaaacgtttc aaaaattggg acgacttaat gaaagcattt ccagagcaag cgaaagactt 

4061 atttagaggt tggttaaacg acggtacgat tgacagtatt attcatgacg agtttaaaaa atatagcgca 

4131 ggattaacat cggcatttgc tttatttaaa gttactgaaa tgaaacaaat gaatgacttt aaatcagaag 

4201 ttaaagactt aattaaagat attgaccgtt tcgttaatgg gtttgaatta aatgagcttg aaccaaagtt 

4271 tgtgatgggc tttggtggta ttcgcaacgc agttaaccaa tctattaata ttgataaaga aacaaatcac 

4341 atgtactcta cacaatccga ttctcaaaaa cctgaaggtt tttggataaa taaattaaca cctagtggtg 

4411 acttaatttc aagcatgcgt attgtacagg gtggtcatgg tacaacaatc ggattagaac gtcaatccaa 

4481 tggtgaaatg aaaatctggt tacatcacga tggtgttgca aaactgttac aagtcgcata taaagataat 

4551 tatgtattag atttagaaga ggctaaaggt ttaacagatt atacaccaca gtcactttta aacaaacaca 

4621 catttacacc gttaattgat gaagcaaatg acaaactcat tttaagattc ggtgacggaa caatacaggt 

4691 tcgttcaaga gcagacgtaa aaaatcacat tgataatgta gaaaaagaaa tgacaattga taattcagaa 

4761 aacaatgata atcgttggat gcaaggcatt gctgttgatg gtgatgattt atactggtta agtggtaaca 

4831 gttcagttaa ttcacatgtt caaatcggta aatattcatt aacaacaggt caaaagattt atgattatcc 

4901 atttaagtta tcatatcaag acggtattaa tttcccacgt gataacttta aagagcctga gggtatttgc 

4971 atttacacaa atccaaaaac aaaacgtaaa tcgttattac ttgctatgac aaacggcggt ggtggaaaac 

5041 gtttccataa tttatatggt ttcttccaac ttggtgagta tgaacacttt gaagcattac gcgcaagagg 

5111 ttcacaaaac tataaattaa caaaagacga cggtcgtgca ttatctattc cagaccatat cgacgattta 

5181 aatgacttaa cgcaagctgg tttttattat actgacgggg gtactgcaga aaaacttaag aatatgccaa 

5251 tgaatggtag caagcgtata attgacgccg gttgtttcat taatgtatac cctacaacac aaacattagg 

5321 tacggttcaa gaattaacac gtttctcaac aggtcgtaaa atggttaaaa tggtgcgtgg tatgacttta 

5391 gacgtattta cgttaaaatg ggattatgga ttatggacaa caatcaaaac tgacgcacca tatcaagaat 

5461 atttggaagc aagtcaatac aataactgga ttgcttatgt aacaacagct ggtgagtatt acattacagg 

5531 taaccaaacg gaattattta gagacgcgcc agaagaaatt aaaaaagtgg gtgcatggtt acgtgtgtca 

5601 agtggtaacg cagtcggcga agtaagacaa acattagagg ctaatatatc ggaatataaa gaattcttca 

5671 gtaatgttaa tgcggaaaca aaacatcgtg aatatggttg ggcagcaaaa catcaaaaat aggagtgata 

5741 taaatgaaat cacaacaaca agcaaaagaa tggatatata agcatgaggg ggcaggtgtt gactttgatg 

5811 gtgcatatgg atttcaatgt atggacttat cagttgctta tgtgtattac attactgacg gtaaagttcg 

5881 catgtggggt aatgctaaag acgcgataaa taatgacttt aaaggtttag cgacggtgta taaaaataca 

5951 ccgagcttca aacctcaatt aggggacgtt gctgtatata caaatggaca atatggacat attcaatgtg 

6021 tgttaagtgg aaatcttgat tattatacat gcttagaaca aaactggtta ggcggcggtt ttgacggttg 

6091 ggaaaaagca accattagaa cacattatta tgacggtgta actcacttta ttagacctaa attttcaggt 

6161 agtaatagca aagcattaga aacatcaaaa gtaaatacat tcggaaaatg gaaacgaaac caatacggca 

6231 catatcatag aaatgaaaat ggtacattta catgtggttt tttaccaata tttgcacgtg tcggtagtcc 

6301 aaaattatca gaacctaatg gctattggtt ccaaccaaac ggttatacac catataacga agtttgttta 

6371 tcagatggtt acgtatggat tggttataac tggcaaggca cacgttatta tttaccagtg cgccaatgga 

6441 atggaaaaac aggtaatagt tacagtgttg gtattccttg gggggtgttc tcataatggg tattttagcc 

6511 tttttctttg aatttagttg gaaaagatac aaataagagg tgtaaacaat ggctgataga atcgtaagaa 

6581 gtttaagaca agttgaaaca attgaacgtt tattggagga aaaaaatgag aaagttaacg aattttaagt 

6651 ttttctataa cacaccgttt acagactatc aaaacacgat tcattttaat agtaataaag aacgtgatga 

6721 ttatttttta aatggtcgtc attttaaatc gttagactat tcaaaacaac cgtataattt tatacgtgat 

6791 agaatggaaa tcaatgttga tatgcagtgg catgacgcac aaggtattaa ctacatgacg tttttatcag 

6861 attttgagga tagaagatat tacgcttttg taaaccaaat cgaatacgtg aatgacgttg tggttaaaat 

6931 atattttgtc attgatacca ttatgacgta tacacaaggg aatgtattag agcaactctc aaacgtcaat 

7001 attgaacgtc aacatttatc aaaacgcacg tataactata tgttaccaat gttacgtaat aatgatgatg 

7071 tgttaaaagt atcaaataaa aactatgttt ataaccaaat gcaacaatat ttggaaaatt tagtattatt 

7141 ccagtcaagc gctgatttat caaagaaatt tggtactaaa aaagagccaa acttagatac gtcaaaaggt 

7211 acgatttatg acaatatcac accaccagtc aacttatacg ttatggaata tggtgacttt attaacttta 

7281 tggataaaat gagtgcctat ccatggatta cgcaaaactt tcaaaaggtt caaatgttac ctaaagactt 

7351 tattaataca aaagacttag aggacgttaa aaccagtgaa aaaattacag gattaaaaac attaaaacag 

7421 ggtggtaaat caaaagaatg gagtctaaaa gatttatcat taagtttctc aaatcttcaa gagatgatgt 

7491 tatctaaaaa agatgaattt aaacatatga tacgtaatga gtatatgaca attgaatttt atgactggaa 

7561 tggaaatacg atgttactcg acgctggtaa gatttcacaa aaaactggtg ttaagttacg tacaaaatca 

7631 attattggtt atcataatga agttcgagta tatccagtag attataacag tgctgaaaac gacagaccaa 

7701 tactcgctaa aaataaagaa atattgattg atacgggttc attcttaaat acaaatataa catttaatag 

7771 ttttgcacaa gtaccaatat taatcaataa tggtatctta ggacaatcac aacaagccaa ccgacaaaaa 

7841 aatgcagaaa gtcaattaat tacaaatcgt attgataatg tattaaatgg tagcgacccg aaatcacgct 

7911 tttatgacgc tgtgagtgta gcaagtaatt taagtccaac tgctttattt ggtaagttta atgaagaata 

7981 taatttctac aaacaacaac aagctgaata taaagattta gccttacaac caccttctgt aactgaatca 

8051 gaaatgggca acgcattcca aattgcgaat agcattaacg gtttaacgat gaaaattagt gtaccgtcac 

8121 ctaaagaaat tacattttta caaaaatatt atatgttgtt tggttttgaa gtgaatgact ataattcatt 

8191 tattgaacca attaacagta tgactgtttg caattattta aaatgtacag gtacgtatac tatacgtgac 

8261 atcgacccca tgttaatgga acaattaaaa gcaattttag aatctggtgt aagattttgg cataatgacg 

8331 gttcaggtaa tccaatgtta caaaatccat taaataacaa atttagagag ggggtataat atgaacgaag 

8401 taaaattcag atttacagac tcagaagcgt ttcacatgtt tatatacgct ggggatttaa aattactcta 

8471 ctttttattt gtattaatgt ccgttgatat tattacaggt atttcaaaag caattaaaaa taataactta 

8541 tggtcaaaaa aatcaatgag aggattttct aaaaaattat tgatattctg tattatcatt ttagcaaaca 

8611 tcattgacca gattttacaa ttaaaaggtg gtctactcat gattacaata ttttattata ttgcaaatga 

8681 gggactttct attgtagaaa attgtgcaga aatggacgta ttagtaccag aacaaattaa agataaatta 

8751 agagtcatta aaaatgatac tgaaaagagt gataacaatg aacgatcaag agaagataga taaatttacg 

8821 cattcctata ttaatgatga ttttggttta acgatagacc agttagtccc taaagtaaaa ggatatgggc 

8891 gctttaatgt atggcttggt ggtaatgaaa gtaaaatcag acaagtatta aaagcagtaa aagagatagg 

8961 tgtttcacct actctttttg ccgtatatga aaaaaatgag ggttttagtt ctggacttgg ttggttaaac 

9031 catacgtctg cacgtggtga ttatttaaca gatgctaaat tcatagcaag aaagttagta tcacaatcaa 

9101 aacaagctgg acaaccgtct tggtatgacg caggtaacat cgtccacttt gtaccacaag acgtacaaag 

9171 aaaaggtaat gcagattttg caaaaaatat gaaagcaggt acaattggac gtgcatatat tccattaaca 

9241 gcagctgcta cttgggcggc atattatcct ttaggtttga aagcatcata taacaaagta caaaactatg 

9311 gtaatccatt tttagacggt gcgaatacta ttctagcctg gggtggtaaa ttagacggta aaggtggatc 

9381 acctagtgat tcgtctgaca gtggtagtag tggtgacagt ggtagttcac tactcgcttt agcaaaacaa 

9451 gccatgcaag aattattaaa aaaaatacaa gacgcattac aatgggacgt tcatagtatt ggtagtgata 

9521 aattttttag taatgattat tttacattag aaaaaacatt taacaacaca tatcatatta aaatgacgat 

9591 tggtttactt gattcattaa aaaaactgat tgatagcgtt caagtagata gtgggagtag tagttctaat 

9661 cctactgatg atgacggaga ccataaacca attagtggta aatcagtcaa gccaaatgga aaaagtggtc 

9731 gtgtgattgg tggtaactgg acatatgcac agttaccaga aaaatataaa aaagcaattg gtgtaccttt 

9801 attcaaaaaa gaatacttat acaaaccagg taacatattt cctcaaacgg gtaatgcagg acaatgtaca 

9871 gaattaacat gggcgtatat gtcacaacta catggtaaaa gacaacctac cgacgacggt caaataacaa 

9941 acggtcagcg tgtatggtac gtctataaaa agttaggtgc aaaaacaaca cataatccaa cagtaggtta 

10011 tggtttctct agtaaaccac catacttaca agcaactgca tacggtattg gtcacacagg tgttgttgta 



10081 gcagtttttg aagatggttc gtctttagtt gcaaactata atgtaccacc atatgttgca ccatcacgtg 

10151 tggtattgta tacactcatt aatggcgtac caaataatgc tggtgataat attgtattct ttagtggtat 

10221 tgcttaatta actatgctat aacgaacaca tgctagtaat gctagtaaat aaaatacaaa acataatcaa 

10291 ttttcgtaca catttttcat gttatctcaa aaagaaaagg agactgttat tttaacagtt gccttttttt 

10361 atttcatcat gttcacgttt taatatatgc aaatcagatt tgttatgtac tgaacgttca actggaaata 

10431 agtcgttaag tgaaaatgaa ccgatgtcac tttcaatata aagaatatca tcaaattgac tatggtcgaa 

10501 attttctcta gcgtctttta atataaattc acgcctcata ttaagttcat cagtaaaata ttcatcatat 

10571 acattaccac atacaatctc agttttagac ggatatatcg atattgtacc ttgctcatta tagatacttt 

10641 tattgttttc aataatggca ccgtcaaaga attgttcacg tacaaaggtt tcaaaatcga cgcttgtatc 

10711 aaaggcgttt ttcggtatac cagcagaagc aattttaatc tttccattca cttcatatgc atatttctta 

10781 tgattcagta caaacatctt atctatctgt tcgttttcaa tatcccattt acctaaggct atcgggtcga 

10851 ataaactggg gttcaataag ggtttaacaa cggatttcat atacaaacta ccagtaccgc aataaataaa 

10921 attgtcgtca atttcacctt ccgttaagta ttggaaagga accaataagt tatacaatga acgtgatgtg 

10991 acaaatgtag agaataatat attacgttca gtgtttttgt aaccgttaat gatattgtat agttcattgt 

11061 tatcatctaa acggaataag ttaaaatgtg aacgtaatgc aggcatgcca tataatccat ttaaaacgac 

11131 tttagataac ataacctcct catttgagta tgggtgttcg ttgatatcat cagtaatgtg atagtcgtaa 

11201 ggtgatgtca tattgatttt gttttttaac ttaccttgtg ttttaataaa atagttttga aaaataatat 

11271 cacgtgcatg aaagtatcca cattcatata taacaaacga attaacacgt atatgcatgc aatcaatacc 

11341 cgtaatgtct tgaatcattc ttaatgtatt tgtattgata ttaacgtaat cattatcatt attatagtat 

11411 tttacaatca tttgacgtaa tacacgtgat ttaattttaa ttaataaatc atcgttaaat acatctttat 

11481 caatcttata taatgaaaaa taattgtcat catctaaaaa agtagggatt aacgttggtt ctgaatagtg 

11551 ttcgtaaaag tataaccatg ttggaatttt ttcatgatac atcacataag gataactcga attgatgtca 

11621 atagaaaaac aaggctcatc aattagtttg tttatgtatt tggtgttata catatttaaa ccaccacgat 

11691 agaatgatct aatatagtca taaaaattca tatcatggaa atgataatgt gtataagata ttttaatatc 

11761 ttgacattgg ttgagtaact gaaaacgtgt catttcatta ttcaagtaag attccataat attcaatgaa 

11831 aatgttaatt tgttatagtc aaaatttgga aatataccac tacaatgaat acggcacata cctaatataa 

11901 tcacgtcatt atgaacgtat gtaagttgtt caggtgtgag ttttgcaaaa catttcacag catagtcata

11971 ggcttcacta tcatccacac cattatcttt atcaaaaatc gtacaattaa aatctgtttt aagttgtgat 

12041 tctgttaaat aaccaccatc aagtaatttc ttacctaatg ttgcaattga tgtattggtt ttcataaagt

12111 tatcaataat attaaattta aaaccattca aaaacattgt taaatctaaa ttgattgaag atttaacacg 

12181 tttttctaaa attacatttt gatttttggc taaaatagta gcctctttca tttttaatgt gtgttcattt 

12251 tcttctgcag attttaaata tatattttcg cgtgtaatat tatcaaaata acgcatggtg tctttaagta 

12321 aaaaatgatt atcgtattta ttacagttat gtgcaatcat gataatatct gcttttgatt ttgtgattgt 

12391 atcacgtctt ttcacatacg tataaaatgc gtcataaaaa gattcgaaac tcggaaatac ttcaacatca 

12461 atttcataac cattaaacca accaattgct acagaataag taacgttttt atatttggtt ggtttttttc 

12531 gtccgttaac tttattgtac gctaatgttt ctatatccca gtacaaaatc attcgacgtt catgtttatg 

12601 atattgcatg cattctagta atcccataat cttacacacc ttttataagc catattgttt cattagatac 

12671 tttttcgtat tctccatata gttatcttcg tatatttttt cttttctttc aaactcactc atatttttct 

12741 tcatttcatt ttttatatga aattttataa ttttattcat atctaaatat aaatatctat cattatcaac 

12811 cacgtaattt ttagagcaag cattgtcaaa atgtaaattg cttggattgt agtaataacg ttccatgttt 

12881 tctttataaa acatatcatc acgtaaatag gtaacatgat tgtctatatc cctaatttta gtacaaaatt 

12951 catattgttt tgtatatggt acaacgataa tatttgtcat aaaagtagtt acattataca tgactttaat 

13021 atatttatca tcagttttga tatagaagaa atcaccgttt tgattgatgt gatttcttaa attatcatcc 

13091 gccaaattat attcgttaaa ttcaaattct ccagttgtca tagcgtcgtc atttgaatta aacgcacgtg 

13161 tgttacgttt ttcattcacg taatcgtttc gtcgcatttc taaaaaaatg tttttgtaaa gtcttgatgt 

13231 attcatttta tgcttttgta ataaattgta tatatttaaa ttggataata taggacttga aaagttgact 

13301 gcattaccta gtaaaaacat tttagggaat ccaatataat caacgttacc atggttacgg tcgattgatt 

13371 catatattgt ttttaactta tcccactcat caattaaata atcatcttca agtgctaaaa actcatcata 

13441 tataataata ggatagtgtt ttaaaaagtt agaatgatat tttaaatcag tggcactatt caaatctgta 

13511 atcacaccaa tttctttatc ttgatagata atagctaaat agtccctagc acttctgaac gtgacacgtt 

13581 ttgatttaaa tagtggattt tcatctatga tttcttcaat aaaatcacgg taagcgtcac gtaatgtata 

13651 atgacgtgat aataaagtaa attttatatc aagtttaata gctaaataaa taaaaaatga aacatagttg 

13721 aacgattttc catcagaacg gtttgaaata gatatataat aatctatatc atcattcata agttcatcaa 

13791 ctaattctat ttgattatac ttatctggga ttttttttct gacatgattg acagcatttt gataatctct 

13861 taccatgtct aaacgatttt gttttaccat gtttttgctc cttgtaatag tttatgatgt cgtttacagt 

13931 gttaaattta ttcgtcaaat gttgcataat ataaaaagtt atacctcaca tcttcatcat caatatttgt 

14001 cactggtcta tctgatttac caatttcttt atataaagta tcgatttctt taatatattt atacattgaa 

14071 gaattattat ttttagcttg taaattatat aaagcgtatt tatgcttttt agcgttttta ttattagaat 

14141 catcattacg gttatatatt tcaagaatat aatttaattt tttatgtctt gaacctctta ccaatgatac 

14211 agcatttaca tatgatacgt ttctttcttt aggaaaatag ggcagatgtg caaaatgttt ccatgtgtca 

14281 atgtacgcct cttgtaaatc tttatcatca aatttaaaat taacattact aaaatcattt aaaaataaat 

14351 ctttttcttg ctcttttcta gcttctcttt cttttttcca tctatccatt tcagacgtat gtctaaccaa 

14421 tgttatcaac ctccatataa agcataaata accattaaaa agataatata gaatataatc aatgtagtga 

14491 ataaaacacc aaatgacacg cgtatatgca gtgtcataag tatgataagt gtaattaaaa atgctaaaag 

14561 gaaaacaatg gctatgttta ataggttatt catggtcaat cactttccca ttatcgtata tgactttgtt 

14631 ttgataaata atcattaatt cgctttcaag aggtttatca aaatttgata atacgtcgtc aattgtaacg 

14701 tttaataaaa tttctcttat taattcatta cttaaataat ttctataata aaatacaagt atattaaaaa 

14771 catgtttttt aatatcaatg tcgatatcta acgtaaataa ctctttttca atttcaaaat catcatattg 

14841 tttgtcaaac tcaatataca catcacccat atttattttt actatacatt ttttattaga tgaagtaaat 

14911 ttttcaaatt tatcattata ataatctcta tttgttaaaa ggcaataaat taaattattt aatctaaaag

14981 tagttttaat tttcattttt atatctcctt aatgtattct atgatatacg cgtatttttt agtgaacagg 

15051 ttatattcat aatatgaata tacaacttta gcgtcatata aatcttcaaa cattgagatt tgatgtggaa 

15121 aatgtccttt aatctcatcg caatataata ataccgtttt gtatttacgt tccatttaaa cacctcataa 

15191 aaaatagggg ataagtatcc cctatgaaat tgtattaaaa tgatacttga ccaaaattga ttgagtaacc 

15261 tttttgacct tttttgtttt catattcata aattgtgaat tgaacttctc cagcattgat aatgtcaaca 

15331 acgtcctcat ctgctctcac ttctttaatt aattctgtta agcggttcgg taagtttacg ttatagtcat 
15401 cagtgacgat aacaccttgt tcaccgaatt ttgattcttt gtttgtgaat aatgctctaa cgatatactc

15471 ttttttcata ccgtattttt ctactaattc tgatagcttg ataaattctc tttctttttc ctcaaattca

15541 aatctcgcta atgtgttttg gtgtcttgat aaaatatctt ttacgtttgt cattttattt ctcctcttat

15611 ttaaattatt tgctttctgc aattgcgatt tgtagtaaat cattgtaata aacttgaatt gttttcgttg

15681 tgcgtgtagt ggacaatagt ttacatgtgt ctggtaataa ttcttttgct tgtgttttgg ttaaatgata

15751 ctcgtgaagt ggtaaaaatt cctcaatgta ttcattatca tcatctaagt aatgaagtat ataacctttg

15821 acacgtaagg taacaatgtc gtcaactttc attattatat cacccctttc taaaaaacgt aaacgttata

15891 cgtttcataa aatcctttat gcatattcca ttgttctatt gggtcatcac cagcaatata agacaatatt

15961 gattctggtt tagtttcgtt gtttagttca tcattcaaga atcgaacaac agaactatta tagtttaata

16031 atagttgttg gcaagccgat aataagttaa ttgcattgtc aaatgtataa gctggattcc attgaatcag

16101 tttattgaat agttgcaaca tttcagtata ggcttgtcct ttttcttctg gtgcattatc aacattaacc

16171 attattatca cttcctaata aagttgaaat tacgcgtaaa acagaattat gatttaaatc ttcaatttca

16241 tcaatgtcaa catcataaaa tgaaatttca ttttctgttc catcaaataa cgctatacat aaacttccat

16311 tcttaaaacg aaaaacatgc ttcaactcaa tgttttttgt ttcattttcc atttttgtta ctccttgttt

16381 tgattacata cttagtatag caaacgttta aaagttttgt caatagtttt tcttaaaaaa gtttaaataa

16451 ttttaaaact actatttaat agaagaaata agattttaag ttcaaatcat aattttgaat aaaagtcaat

16521 agatacataa attttgtatt tgatgaatat gtaataggtt agataagttg gttaagttgt tgcacagtat

16591 ttttaagttt agtaaagaaa tgataagtaa attcataagt tttgatttgt ataatcgttt attttaaacc

16661 ggtggggt

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Table 17 



Phage 44AHJD ORFs list 
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65 
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15551..15658 


35 




69 


44AHJDORF062 


1 


4285..4389


34 




70 


44AHJDORF063 


-3 
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Table 18 



Predicted amino acid sequences 

44AHJDORF001 

12627 atgggattactagaatgcatgcaatatcataaacatgaacgtcgaatgattttatactgggatatagaaacattagcgtacaat 

1 MGLLECMQYHKHERRMILYWDIETLAYN

12543 aaagttaacggacgaaaaaaaccaaccaaatataaaaacgttacttattctgtagcaattggttggtttaatggttatgaaatt 

29 KVNGRKKPTKYKNVTYSVAIGWFNGYEI 

12459 gatgttgaagtatttccgagtttcgaatctttttatgacgcattttatacgtatgtgaaaagacgtgatacaatcacaaaatca

57 DVEVFPSFESFYDAFYTYVKRRDTITKS

12375 aaaacagatattatcatgattgcacataactgtaataaatacgataatcattttttacttaaagacaccatgcgttattttgat

85 KTDIIMIAHNCNKYDNHFLLKDTMRYFD

12291 aatattacacgcgaaaatatatatttaaaatctgcagaagaaaatgaacacacactaaaaatgaaagaggctactattttagcc 

113 NITRENIYLKSAEENEHTLKMKEATILA 

12207 aaaaatcaaaatgtaattttagaaaaacgtgttaaatcttcaatcaatttagatttaacaatgtttttaaatggttttaaattt 

141 KNQNVILEKRVKSSINLDLTMFLNGFKF 

12123 aatattattgataactttatgaaaaccaatacatcaattgcaacattaggtaagaaattacttgatggtggttatttaacagaa 

169 NIIDNFMKTNTSIATLGKKLLDGGYLTE

12039 tcacaacttaaaacagattttaattatacgatttttgataaagataatgatatgaatgatagtgaagcctatgactatgctgtg 

197 SQLKTDFNYTIFDKDNDHNDSEAYDYAV

11955 aaatgttttgcaaaactcacacctgaacaacttacatacattcataatgacgtgattatattaggtatgtgccatattcattat 

225 KCFAKLTPEQLTYIHNDVIILGMCHIHY

11871 agtgatatatttccaaattttgactataacaaattaacattttcattgaatattatggaatcttacttgaataatgaaatgaca

253 SDIFPNFDYNKLTFSLNIMESYLNNEMT

11787 cgttttcagttactcaaccaatatcaagatattaaaatatcttatacacattatcatttccatgatatgaatttttatgactat 

281 RFQLLNQYQDIKISYTHYHFHDMNFYDY

11703 attaaatcattctatcgtggtggtttaaatatgtataacaccaaatacataaacaaactaattgatgagccttgtttttctatt 

309 IKSFYRGGLNMYNTKYINKLIDEPCFSI 

11619 gacatcaattcgagttatccttatgtgatgtatcatgaaaaaattccaacatggttatacttttacgaacactattcagaacca 

337 DINSSYPYVMYHEKIPTWLYFYEHYSEP

11535 acgttaatccctacttttttagatgatgacaattatttttcattatataagattgataaagatgtatttaacgatgatttatta

365 TLIPTFLDDDNYFSLYKIDKDVFNDDLL

11451 attaaaattaaatcacgtgtattacgtcaaatgattgtaaaatactataataatgataatgattacgttaatatcaatacaaat 

393 IKIKSRVLRQMIVKYYNNDNDYVNINTN

11367 acattaagaatgattcaagacattacgggtattgattgcatgcatatacgtgttaattcgtttgttatatatgaatgtgaatac 

421 TLRMIQDITGIDCMHIRVNSFVIYECEY 

11283 tttcatgcacgtgatattatttttcaaaactattttattaaaacacaaggtaagttaaaaaacaaaatcaatatgacatcacct 

449 FHARDIIFQNYFIKTQGKLKNKINMTSP

11199 tacgactatcacattactgatgatatcaacgaacacccatactcaaatgaggaggttatgttatctaaagtcgttttaaatgga 

477 YDYHITDDINEHPYSNEEVMLSKVVLNG 

11115 ttatatggcatacctgcattacgttcacattttaacttattccgtttagatgataacaatgaactatacaatatcattaacggt 

505 LYGIPALRSHFNLFRLDDNNELYNIING 

11031 tacaaaaacactgaacgtaatatattattctctacatttgtcacatcacgttcattgtataacttattggttcctttccaatac 

533 YKNTERNILFSTFVTSRSLYNLLVPFQY 

10947 ttaacggaaagtgaaattgacgacaattttatttattgcgatactgatagtttgtatatgaaatccgttgttaaacccttattg 

561 LTESEIDDNFIYCDTDSLYMKSVVKPLL 

10863 aaccccagtttattcgacccgatagccttaggtaaatgggatattgaaaacgaacagatagataagatgtttgtactgaatcat 

589 NPSLFDPIALGKWDIENEQIDKMFVLNH

10779 aagaaatatgcatatgaagtgaatggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgat 

617 KKYAYEVNGKIKIASAGIPKNAFDTSVD 

10695 tttgaaacctttgtacgtgaacaattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaata 

645 FETFVREQFFDGAIIENNKSIYNEQGTI 

10611 tcgatatatccgtctaaaactgaaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatgaaacgtgaa 

673 SIYPSKTEIVCGNVYDEYFTDELNMKRE 

10527 tttatattaaaagacgctagagaaaatttcgaccatagtcaatttgatgatattctttatattgaaagtgacatcggttcattt 

701 FILKDARENFDHSQFDDILYIESDIGSF 

10443 tcacttaacgacttatttccagttgaacgttcagtacataacaaatctgatttgcatatattaaaacgtgaacatgatgaaata

729 SLNDLFPVERSVHNKSDLHILKREHDEI 

10359 aaaaaaggcaactgttaa 10342 

757 KKGNC*
44AHJDORF002 

3789 atggcatataatgaaaacgattttaaatattttgatgacattcgtccatttttagacgaaatttataaaacgagagaacgttat 

1 MAYNENDFKYFDDIRPFLDEIYKTRERY

3873 acaccgttttacgatgatagagcagattataatactaattcaaaatcatattatgattatatttcaagattatcaaaactaatt

29 TPFYDDRADYNTNSKSYYDYISRLSKLI

3957 gaagtattagcacgtcgtatttgggactatgacaatgaattaaaaaaacgtttcaaaaattgggacgacttaatgaaagcattt 

57 EVLARRIWDYDNELKKRFKNWDDLMKAF 

4041 ccagagcaagcgaaagacttatttagaggttggttaaacgacggtacgattgacagtattattcatgacgagtttaaaaaatat 

85 PEQAKDLFRGWLNDGTIDSIIHDEFKKY

4125 agcgcaggattaacatcggcatttgctttatttaaagttactgaaatgaaacaaatgaatgactttaaatcagaagttaaagac 

113 SAGLTSAFALFKVTEMKQMNDFKSEVKD 

4209 ttaattaaagatattgaccgtttcgttaatgggtttgaattaaatgagcttgaaccaaagtttgtgatgggctttggtggtatt 
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141 LIKDIDRFVNGFELNELEPKFVMGFGGI 

4293 cgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaaaacctgaa 

169 RNAVNQSINIDKETNHMYSTQSDSQKPE 

4377 ggtttttggataaataaattaacacctagtggtgacttaatttcaagcatgcgtattgtacagggtggtcatggtacaacaatc

197 GFWINKLTPSGDLISSMRIVQGGHGTTI

4461 ggattagaacgtcaatccaatggtgaaatgaaaatctggttacatcacgatggtgttgcaaaactgttacaagtcgcatataaa 

225 GLERQSNGEMKIWLHHDGVAKLLQVAYK 

4545 gataattatgtatcagatttagaagaggctaaaggtttaacagattatacaccacagtcacttttaaacaaacacacacttaca 

253 DNYVLDLEEAKGLTDYTPQSLLNKHTFT 

4629 ccgttaattgatgaagcaaatgacaaactcattttaagattcggtgacggaacaatacaggttcgttcaagagcagacgtaaaa

281 PLIDEANDKLILRFGDGTIQVRSRADVK 

4713 aatcacattgataatgtagaaaaagaaatgacaattgataattcagaaaacaatgataatcgttggatgcaaggcattgctgtt 

309 NHIDNVEKEMTIDNSENNDNRWMQGIAV

4797 gatggtgatgatttatactggttaagtggtaacagttcagttaattcacatgttcaaatcggtaaatattcattaacaacaggt 

337 DGDDLYWLSGNSSVNSHVQIGKYSLTTG 

4881 caaaagatttatgattatccatttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggt 

365 QKIYDYPFKLSYQDGINFPRDNFKEPEG 

4965 atttgcatttatacaaatccaaaaacaaaacgtaaatcgttattacttgctatgacaaacggcggtggtggaaaacgtttccat

393 ICIYTNPKTKRKSLLLAMTNGGGGKRFH

5049 aatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggttcacaaaactataaattaaca

421 NLYGFFQLGEYEHFEALRARGSQNYKLT 

5133 aaagacgacggtcgtgcattatctattccagaccatatcgacgatttaaatgacttaacgcaagctggtttttattatattgac 

449 KDDGRALSIPDHIDDLNDLTQAGFYYID

5217 gggggtactgcagaaaaacttaagaatatgccaatgaatggtagcaagcgtataattgacgctggttgtttcattaatgtatac 

477 GGTAEKLKNMPMNGSKRIIDAGCFINVY

5301 cctacaacacaaacattaggtacggttcaagaattaacacgtttctcaacaggtcgtaaaatggttaaaatggtgcgtggtatg 

505 PTTQTLGTVQELTRFSTGRKMVKMVRGM

5385 actttagacgtatttacgttaaaatgggattatggattatggacaacaatcaaaactgacgcaccatatcaagaatatttggaa 

533 TLDVFTLKWDYGLWTTIKTDAPYQEYLE 

5469 gcaagtcaatacaataactggattgcttatgtaacaacagctggtgagtattacattacaggtaaccaaatggaattatttaga 

561 ASQYNNWIAYVTTAGEYYITGNQMELFR 

5553 gacgcgccagaagaaattaaaaaagtgggtgcatggttacgtgtgtcaagtggtaacgcagtcggtgaagtaagacaaacatta 

589 DAPEEIKKVGAWLRVSSGNAVGEVRQTL 

5637 gaggctaatatatcggaatataaagaattcttcagtaatgttaatgcggaaacaaaacatcgtgaatatggttgggtagcaaaa 

617 EANISEYKEFFSNVNAETKHREYGWVAK 

5721 catcaaaaatag 5732

645 HQK*
44AHJDORF003 

6626 atgagaaagttaacgaattttaagtttttctataacacaccgtttacagactatcaaaacacgattcattttaatagtaataaa 

1 MRKLTNFKFFYNTPFTDYQNTIHFNSNK 

6710 gaacgtgatgattattttttaaatggtcgtcattttaaatcgttagactattcaaaacaaccgtataattttatacgtgataga 

29 ERDDYFLNGRHFKSLDYSKQPYNFIRDR 

6794 atggaaatcaatgttgatatgcagtggcatgacgcacaaggtattaactacatgacgtttttatcagattttgaggatagaaga 

57 MEINVDMQWHDAQGINYMTFLSDFEDRR

6878 tattacgcttttgtaaaccaaatcgaatacgtgaatgacgttgtggttaaaatatattttgtcattgataccattatgacgtat 

85 YYAFVNQIEYVNDVVVKIYFVIDTIMTY 

6962 acacaagggaatgtattagagcaactctcaaacgtcaatattgaacgtcaacatttatcaaaacgcacgtataactatatgtta 

113 TQGNVLEQLSNVNIERQHLSKRTYNYML 

7046 ccaatgttacgtaataatgatgatgtgttaaaagtatcaaataaaaactatgtttataaccaaatgcaacaatatttggaaaat

141 PMLRNNDDVLKVSNKNYVYNQMQQYLEN 

7130 ttagtattattccagtcaagcgctgatttatcaaagaaatttggtactaaaaaagagccaaacttagatacgtcaaaaggtacg 

169 LVLFQSSADLSKKFGTKKEPNLDTSKGT

7214 atttatgacaatatcacatcaccagtcaactcatacgttatggaatatggtgactttattaactttatggataaaatgagtgcc 

197 IYDNITSPVNLYVMEYGDFINFMDKMSA 

7298 tatccatggattacgcaaaactttcaaaaggttcaaatgttacctaaagactttattaatacaaaagacttagaggacgttaaa 

225 YPWITQNFQKVQKLPKDFINTKDLEDVK

7382 accagtgaaaaaattacaggattaaaaacattaaaacagggtggtaaatcaaaagaatggagtctaaaagatttatcattaagt 

253 TSEKITGLKTLKQGGKSKEWSLKDLSLS 

7466 ttctcaaatcttcaagagatgatgttatctaaaaaagatgaatttaaacatatgatacgcaatgagtatatgacaattgaattt

281 FSNLQEMMLSKKDEFKHMIRNEYMTIEF

7550 tatgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatt 

309 YDWNGNTMLLDAGKISQKTGVKLRTKSI 

7634 attggttatcataatgaagttcgagtatatccagtagattataacagtgctgaaaacgacagaccaatactcgctaaaaataaa 

337 IGYHNEVRVYPVDYNSAENDRPILAKNK 

7718 gaaatattgattgatacgggttcattcttaaacacaaatataacatttaatagttttgcacaagtaccaatattaatcaataat 

365 EILIDTGSFLNTNITFNSFAQVPILINN 

7802 ggtatcttaggacaatcacaacaagccaaccgacaaaaaaatgcagaaagtcaattaattacaaatcgtattgataatgtatta 

393 GILGQSQQANRQKNAESQLITNRIDNVL

7886 aatggtagcgacccgaaatcacgctttcatgacgctgtgagtgcagcaagtaacttaagtccaactgctttatttggcaagtct

421 NGSDPKSRFYDAVSVASNLSPTALFGKF

7970 aatgaagaatataatttctacaaacaacaacaagctgaatataaagatttagccttacaaccaccttctgtaactgaatcagaa 

449 NEEYMFYKQQQAEYKDLALQPPSVTESE 

8054 atgggcaacgcattccaaactgcgaatagcattaacggcttaacgatgaaaattagtgcaccgtcacctaaagaaattacattt 

477 MGNAFQIANSINGLTMKISVPSPKEITF 

8138 ttacaaaaatattatatgttgtttggttttgaagtgaatgactataattcatttattgaaccaattaacagtatgactgtttgc 



505 LQKYYMLFGFEVNDYNSFIEPINSMTVC

8222 aattatctaaaatgcacaggtacgtatactatacgtgacatcgaccccatgttaatggaacaattaaaagcaattttagaatct 

533 NYLKCTGTYTIRDIDPMLMEQLKAILES

8306 ggtgtaagattttggcataatgacggttcaggtaatccaatgttacaaaatccattaaataacaaatttagagagggggtataa

561 GVRFWHNDGSGNPMLQNPLNNKFREGV*
44AHJDORF004 

8764 atgatactgaaaagagtgataacaatgaacgatcaagagaagatagataaatttacgcattcctatattaatgatgattttggt 

1 MILKRVITMNDQEKIDKFTHSYINDDFG

8848 ttaacgatagaccagttagtccctaaagtaaaaggatatgggcgctttaatgtatggcttggtggtaatgaaagtaaaatcaga 

29 LTIDQLVPKVKGYGRFNVWLGGNESKIR

8932 caagtattaaaagcagtaaaagagataggtgtttcacctactctttttgccgcatatgaaaaaaatgagggttttagttctgga 

57 QVLKAVKEIGVSPTLFAVYEKNEGFSSG 

9016 cttggttggttaaaccatacgtctgcacgtggtgattatttaacagatgctaaattcatagcaagaaagttagtatcacaatca 

85 LGWLNHTSARGDYLTDAKFIARKLVSQS

9100 aaacaagctggacaaccgtcttggcatgacgcaggtaacatcgtccactttgtaccacaagacgtacaaagaaaaggtaatgca 

113 KQAGQPSWYDAGNIVHFVPQDVQRKGNA

9184 gattttgcaaaaaatatgaaagcaggtacaatcggacgtgcatatattccattaacagcagctgctacttgggcggcatattat 

141 DFAKNMKAGTIGRAYIPLTAAATWAAYY

9268 cctttaggtttgaaagcatcatataacaaagtacaaaactatggtaatccatttttagacggtgcgaatactattctagcttgg 

169 PLGLKASYNKVQNYGNPFLDGANTILAW

9352 ggtggtaaattagacggtaaaggtggatcacctagtgattcgtctgacagtggtagcagtggtgacagtggtagttcactactc 

197 GGKLDGKGGSPSDSSDSGSSGDSGSSLL 

9436 gctttagcaaaacaagccatgcaagaattattaaaaaaaatacaagacgcattacaatgggacgttcatagtattggtagtgat

225 ALAKQAMQELLKKIQDALQWDVHSIGSD

9520 aaattttttagtaatgattattttacaccagaaaaaacatttaacaacacatatcacattaaaatgacgattggtttacttgat 

253 KFFSNDYFTLEKTFNNTYHIKMTIGLLD

9604 tcattaaaaaaactgattgatagcgttcaagtagatagtgggagtagtagttctaatcctactgatgatgacggagaccataaa 

281 SLKKLIDSVQVDSGSSSSNPTDDDGDHK 

9688 ccaattagtggtaaatcagtcaagccaaatggaaaaagtggtcgtgtgattggtggtaactggacacatgcacagttaccagaa 

309 PISGKSVKPNGKSGRVIGGNWTYAQLPE

9772 aaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttatacaaaccaggtaacatatttcctcaaacgggtaat 

337 KYKKAIGVPLFKKEYLYKPGNIFPQTGN 

9856 gcaggacaatgtacagaattaacatgggcgtatatgtcacaactacatggtaaaagacaacctaccgacgacggtcaaataaca 

365 AGQCTELTWAYMSQLHGKRQPTDDGQIT 

9940 aacggtcagcgtgtatggtacgtctataaaaagttaggtgcaaaaacaacacataatccaacagcaggttatggtttctctagt 

393 NGQRVWYVYKKLGAKTTHNPTVGYGFSS

10024 aaaccaccatacttacaagcaactgcatatggtattggtcacacaggtgttgttgtagcagtttttgaagatggttcgttttta 

421 KPPYLQATAYGIGHTGVVVAVFEDGSFL 

10108 gttgcaaactataatgtaccaccatatgttgcaccatcacgtgtggtattgtacacactcattaatggcgtaccaaataatgct 

449 VANYNVPPYVAPSRVVLYTLINGVPNNA

10192 ggtgataatattgtattctttagtggtattgcttaa 10227 

477 GDNIVFFSGIA* 

44AHJDORF005

13890 atggtaaaacaaaatcgtttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataagtat 

1 MVKQNRLDMVRDYQNAVNHVRKKIPDKY

13806 aatcaaatagaattagttgacgaactcatgaatgatgatatagattattatatacctatctcaaaccgttctgatggaaaatcg 

29 NQIELVDELMNDDIDYYISISNRSDGKS

13722 ttcaactatgcttcatttttcacttatctagctattaaacttgatataaaatttactttattatcacgtcattatacattacgt 

57 FNYVSFFIYLAIKLDIKFTLLSRHYTLR

13638 gacgcttaccgtgattttattgaagaaatcatagatgaaaatccactatttaaatcaaaacgtgtcacgttcagaagtgctagg 

85 DAYRDFIEEIIDENPLFKSKRVTFRSAR

13554 gactatttagctattatctatcaagataaagaaattggtgtgattacagatttgaatagtgccactgatttaaaatatcattct

113 DYLAIIYQDKEIGVITDLNSATDLKYHS

13470 aactttttaaaacactatcctattattatatatgatgagtttttagcacttgaagatgattatttaactgatgagtgggataag 

141 NFLKHYPIIIYDEFLALEDDYLIDEWDK

13386 ttaaaaacaacacatgaatcaaccgaccgtaaccatggtaacgttgattatattggattccctaaaatgtttttactaggtaat 

169 LKTIYESIDRNHGNVDYIGFPKMFLLGN

13302 gcagtcaacttctcaagccctatattatccaatttaaatatatacaatttatcacaaaagcacaaaatgaatacatcaagactt 

197 AVNFSSPILSNLNIYNLLQKHKMNTSRL

13218 tacaaaaacatttctttagaaatgcgacgaaacgattacgtgaatgaaaaacgtaacacacgtgcgtttaattcaaatgacgac 

225 YKNIFLEMRRNDYVNEKRNTRAFNSNDD

13134 gctatgacaactggagaatttgaatttaacgaatataacttggcggatgataatttaagaaatcacatcaatcaaaacggtgat 

253 AMTTGEFEFNEYNLADDNLRNHINQNGD

13050 ttcttctatatcaaaactgatgataaatatattaaagtcatgcataatgtaactacttttatgacaaatattatcgttgtacca 

281 FFYIKTDDKYIKVKYNVTTFMTNIIVVP

12966 tatacaaaacaatacgaattttgtactaaaattagggacatagacaatcatgttacccattcacgtgatgatatgttttataaa 

309 YTKQYEFCTKIRDIDNHVTYLRDDMFYK 

12882 gaaaacatggaacgttattactacaacccaagcaatttacattttgacaatgcttactctaaaaattacgtggttgataatgat
337 ENMERYYYNPSNLHFDNAYSKNYVVDND 
12798 agatatttatatttagatatgaataaaattataaaatttcatataaaaaatgaaatgaagaaaaatatgagtgagtttgaaaga 
365 RYLYLDMNKIIKFHIKNEMKKNMSEFER 
12714 aaagaaaaaatatacgaagataactatatagagaatacgaaaaagcatctaatgaaacaatatggcttataa 12643 
393 KEKIYEDNYIENTKKYLMKQYGL* 
44AHJDORF006 
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803 atggcacaacaatctacaaaaaatgaaactgcacttttagtagcaaagtcagctaaatcagcgttacaagattttaatcatgat 

1 MAQQSTKNETALLVAKSAKSALQDFNHD 

887 tattcaaaatcttggacatttggcgacaaatgggataattcaaatacaatgttcgaaacatttgtaaataaatatttattccct

29 YSKSWTFGDKWDNSNTMFETFVNKYLFP 

971 aagattaatgagactttattaatcgatattgcattaggtaatcgttttaattggttagctaaagagcaagattttattggacaa 

57 KINETLLIDIALGNRFNWLAKEQDFIGQ 

1055 tatagtgaagaatacgtgattatggacacagtaccaattaacatggacttatctaaaaatgaggaattaatgttgaaacgtaat 

85 YSEEYVIMDTVPINMDLSKNEELMLKRN 

1139 tatccacgtatggcaactaagttatatggcaacggaattgtgaagaaacaaaaattcacattaaacaacaatgatacacgtttc 

113 YPRMATKLYGNGIVKKQKFTLNNNDTRF 

1223 aatttccaaacattagcagacgcaactaattacgctttaggtgtatacaaaaagaaaatttctgatattaatgtattagaagaa 

141 NFQTLADATNYALGVYKKKISDINVLEE 

1307 aaagaaatgcgtgcaatgttagttgattactcattgaatcaattatccgaaacaaatgtacgtaaagcaacatcaaaagaagat

169 KEMRAMLVDYSLNQLSETNVRKATSKED 

1391 ttagcaagcaaagtttttgaagcaatcctaaacttacaaaacaacagtgctaaatataatgaagtacatcgtgcatcaggtggt 

197 LASKVFEAILNLQNNSAKYNEVHRASGG

1475 gcaattggacaatatacaactgtatcaaaattaaaagatattgtgattttaacaacagattcattaaaatcttatcttttagat 

225 AIGQYTTVSKLKDIVILTTDSLKSYLLD

1559 actaagattgcaaacacattccagattgcaggcattgatttcacagatcacgttattagttttgacgacttaggtggcgtgttt 

253 TKIANTFQIAGIDFTDHVISFDDLGGVF 

1643 aaagtaacaaaagaatttaagttacaaaaccaagattcaattgactttttacgtgcgtatggagattatcaatcacaattagga 

281 KVTKEFKLQNQDSIDFLRAYGDYQSQLG

1727 gatacaattccagttggtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaacca 

309 DTIPVGAVFTYDVSKLKEFTGNVEEIKP 

1811 aaatcagatttatatgcgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaaaaccaccattc 

337 KSDLYAFILDINSIKYKRYTKGMLKPPF

1895 cataaccctgaatttgatgaagttacacactggattcattactattcatttaaagccattagtccattctttaataaaatttta 

365 HNPEFDEVTHWIHYYSFKAISPFFNKIL

1979 attactgaccaagatgtaaatccaaaaccagaggaagaattacaagaataa 2029 

393 ITDQDVNPKPEEELQE* 
44AHJDORF007 

2044 atgaacaacgataaaagaggtttaaacgttgagttatcaaaggaaatcagcaaaagagttgttgaacatcgcaacagatttaaa 

1 MNNDKRGLNVELSKEISKRVVEHRNRFK 

2128 cgtcttatgtttaatcgttatttggaatttttaccgctactaatcaactataccaatcgtgatacggttggtatagattttatt 

29 RLMFNRYLEFLPLLINYTNRDTVGIDFI 

2212 cagttagaatcagctttaagacaaaacattaatgtagttgttggtgaagctagaaataagcaaattatgattcttggttatgta 

57 QLESALRQNINVVVGEARNKQIMILGYV

2296 aataacacttactttaatcaagcaccaaatttttcatcaaactttaatttccaatttcaaaaacgattaactaaagaagatata 

85 NNTYFNQAPNFSSNFNFQFQKRLTKEDI

2380 tattttattgtacctgactatttaatacctgatgattgtctacaaattcataagctatatgataactgtatgagtggtaacttt 

113 YFIVPDYLIPDDCLQIHKLYDNCMSGNF

2464 gttgtcacgcaaaataaaccaattcaatataatagtgatatagaaattatagaacattatactgatgaattagcagaagttgct

141 VVMQNKPIQYNSDIEIIEHYTDELAEVA 

2548 ttatctcgcttttctttaatcatgcaagcaaaatttagcaagatatttaaatcagaaattaatgacgagtcaatcaatcaactt 

169 LSRFSLIMQAKFSKIFKSEINDESINQL 

2632 gtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatgacgatatcattgatttaacaagt 

197 VSEIYNGAPFVKMSPMFNADDDIIDLTS

2716 aatagcgtaatcccagcattaactgaaatgaaacgggaatatcaaaacaaaattagtgaattaagtaactatttaggcattaat 

225 NSVIPALTEMKREYQNKISELSNYLGIN 

2800 tcattagccgttgataaagaaagcggtgtttcagacgaagaggcaaaaagtaatcgtggatttaccacatcaaacagtaatatc 

253 SLAVDKESGVSDEEAKSNRGFTTSNSNI 

2884 tatttaaaaggtcgtgaaccaattacgtttttatcaaagcgttatggtttagatattaaaccgtattacgatgatgaaacaacg 

281 YLKGREPITFLSKRYGLDIKPYYDDETT

2968 tctaaaatatcaatggtagacacactttttaaagatgaaagcagtgatataaatggctag 3027 

309 SKISMVDTLFKDESSDING*
44AHJDORF008 

3020 atggctagatacacaatgactttatacgatttcattaaatcagaattgattaaaaaaggtttcaatgaatttgtaaatgataat 

1 MARYTMTLYDFIKSELIKKGFNEFVNDN 

3104 aaattaacgttttatgatgatgaatttcaattcatgcaaaaaatgctgaagttcgacaaagacgttttagctatcgttaatgaa 

29 KLTFYDDEFQFMQKMLKFDKDVLAIVNE 

3188 aaagtatttaaaggtttttcattgaaagatgaattatcagatttactttttaaaaaatcatttacgattcattttttagataga 

57 KVFKGFSLKDELSDLLFKKSFTIHFLDR 

3272 gaaatcaacagacaaacagttgaagcatttggcatgcaagtgattactgtatgtattacacatgaggattatttaaatgtggtt 

85 EINRQTVEAFGMQVITVCITHEDYLNVV 

3356 tattcatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaacactgatgaa

113 YSSSEVEKYLQSQGFTEHNEDTTSNTDE 

3440 acatcgaatcaaaatgctacatctttagacaattcaactggcatgactgcaaacagaaacgcttatgtgtcattaccacaaagt

141 TSNQNATSLDNSTGMTANRNAYVSLPQS 

3524 gaggttaacattgatgttgataatacaacgttacgattcgctgataataatacgattgataacggtaaaactgtgaataaatcg

169 EVNIDVDNTTLRFADNNTIDNGKTVNKS

3608 agtaacgaaagtaatcaaaacgcaaaacgtaatcaaaatcaaaaaggtaatgcaaaaggtacacaattcactaagcagtattta 

197 SNESNQNAKRNQNQKGNAKGTQFTKQYL 

3692 attgataatattgataaagcgtacgatttaagaaagaaaattttaaatgaatttgataaaaaatgttttttacaaatttggtag 3775

225 IDNIDKAYDLRKKILNEFDKKCFLQIW* 
44AHJDORF009 
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5744 atgaaatcacaacaacaagcaaaagaatggatatataagcatgagggggcaggtgttgactttgatggtgcatatggatttcaa 

1 MKSQQQAKEWIYKHEGAGVDFDGAYGFQ

5828 tgtatggacttatcagttgcttatgtgtattacattactgacggtaaagttcgcatgtggggtaatgctaaagacgcgataaat 

29 CMDLSVAYVYYITDGKVRMWGNAKDAIN

5912 aatgactttaaaggtttagcgacggtgtataaaaatacaccgagctttaaacctcaattaggggacgttgctgtatatacaaat 

57 NDFKGLATVYKNTPSFKPQLGDVAVYTN 

5996 ggacaatatggacatattcaatgtgtgttaagtggaaatcttgattattatacatgcttagaacaaaactggttaggcggcggt 

85 GQYGHIQCVLSGNLDYYTCLEQNWLGGG 

6080 tttgacggttgggaaaaagcaaccattagaacacattattatgacggtgtaactcactttattagacctaaattttcaggtagt 

113 FDGWEKATIRTHYYDGVTHFIRPKFSGS 

6164 aatagcaaagcattagaaacatcaaaagtaaatacatttggaaaatggaaacgaaaccaatacggcacatattatagaaatgaa 

141 NSKALETSKVNTFGKWKRNQYGTYYRNE 

6248 aatggtacatttacatgtggttttttaccaatatttgcacgtgtcggtagcccaaaattatcagaacctaatggctattggttc 

169 NGTFTCGFLPIFARVGSPKLSEPNGYWF

6332 caaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggcacacgt 

197 QPNGYTPYNEVCLSDGYVWIGYNWQGTR 

6416 tattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcataa 6496

225 YYLPVRQMNGKTGNSYSVGIPWGVFS*
44AHJDORF010 

14420 ttggttagacatacgtctgaaatggatagatggaaaaaagaaagagaagctagaaaagagcaagaaaaagatttatttttaaat 

1 LVRHTSEMDRWKKEREARKEQEKDLFLN 

14336 gattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatctg 

29 DFSNVNFKFDDKDLQEAYIDTWKHFAHL 

14252 ccctattttcctaaagaaagaaacgtatcatatgtaaatgctgtatcattggtaagaggttcaagacataaaaaattaaattat 

57 PYFPKERNVSYVNAVSLVRGSRHKKLNY

14168 attcttgaaatatataaccgtaatgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagct 

85 ILEIYNRNDDSNNKNAKKHKYALYNLQA 

14084 aaaaataataattcttcaatgtataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtg 

113 KNNNSSMYKYIKEIDTLYKEIGKSDRPV

14000 acaaatattgatgatgaagatgtgaggtataactttttatattatgcaacatttgacgaataa 13938 

141 TNIDDEDVRYNFLYYATFDE* 
44AHJDORF011 

15593 atgacaaacgtaaaagatattttatcaagacaccaaaacacattagcgagatttgaatttgaggaaaaagaaagagaatttatc 

1 MTNVKDILSRHQNTLARFEFEEKEREFI 

15509 aaactatcagaattagtagaaaaatacggtatgaaaaaagagtatatcgttagagcattattcacaaacaaagaatcaaaattc 

29 KLSELVEKYGMKKEYIVRALFTNKESKF 

15425 ggtgaacaaggtgttatcgtcactgatgactataacgtaaacttaccgaaccacttaacagaattaattaaagaaatgagagca 

57 GEQGVIVTDDYNVNLPNHLTELIKEMRA 

15341 gatgaggacgttgttgacattatcaatgctggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggt 

85 DEDVVDIINAGEVQFTIYEYENKKGQKG

15257 tactcaatcaattttggtcaagtatcattttaa 15225 

113 YSINFGQVSF* 
44AHJDORF012 

8391 atgaacgaagtaaaattcagatttacagactcagaagcgtttcacatgtttatatacgctggggatttaaaattactctacttt 

1 MNEVKFRFTDSEAFHMFIYAGDLKLLYF 

8475 ttatttgtattaatgttcgttgatattattacaggtatttcaaaagcaattaaaaataataacttatggtcaaaaaaatcaatg 

29 LFVLMFVDIITGISKAIKNNNLWSKKSM

8559 agaggattttctaaaaaattattgatattctgtattatcattttagcaaacatcattgaccagattttacaattaaaaggtggt 

57 RGFSKKLLIFCIIILANIIDQILQLKGG 

8643 ctactcatgattacaatattttattatattgcaaatgagggactttctattgtagaaaattgtgcagaaatggacgtattagta 

85 LLMITIFYYIANEGLSIVENCAEMDVLV 

8727 ccagaacaaattaaagataaattaagagtcattaaaaatgatactgaaaagagtgataacaatgaacgatcaagagaagataga 

113 PEQIKDKLRVIKNDTEKSDNNERSREDR

8811 taa 8813 

141 * 
44AHJDORF013 

14996 atgaaaattaaaactacttttagattaaataatttaatttattaccttttaacaaatagagattattataatgataaatttgaa 

1 MKIKTTFRLNNLIYYLLTNRDYYNDKFE 

14912 aaatttacttcatctaataaaaaatgtatagtaaaaataaatatgggtgatgtgtatattgagtttgacaaacaatatgatgat

29 KFTSSNKKCIVKINMGDVYIEFDKQYDD

14828 tttgaaattgaaaaagagttatttacgttagatatcgacattgatattaaaaaacatgtttttaatatacttgtattttattat 

57 FEIEKELFTLDIDIDIKKHVFNILVFYY

14744 agaaattatttaagtaatgaattaataagagaaattttattaaacgttacaattgacgacgtattatcaaattttgataaacct 

85 RNYLSNELIREILLNVTIDDVLSNFDKP 

14660 cttgaaagcgaattaatgattatttatcaaaacaaagtcatatacgataatgggaaagtgattgaccatgaataa 14586

113 LESELMIIYQNKVIYDNGKVIDHE* 
44AHJDORF113 

199 atgacagaatttgatgaaatcgtaaaaccagacgacaaagaagaaacttcagaatcaactgaagaaaatttagaatcaactgaa 

1 MTEFDEIVKPDDKEETSESTEENLESTE

283 gaaacttcagaatcaactgaagaatcaactgaagaaccaactgaagaatcaactgaagataaaacagtagaaacaattgaagaa

29 ETSESTEESTEESTEESTEDKTVETIEE

367 gaaaatgaaaacaaattagaacctactacaacagatgaagatagttcgaaatttgaccctgttgtattagaacaacgtattgct 

57 ENENKLEPTTTDEDSSKFDPVVLEQRIA 

451 tcattagaacaacaagtgactacttttttatcttcacaaatgcaacaaccacaacaagtacaacaaacacaatcagatgtaaca

85 SLEQQVTTFLSSQMQQPQQVQQTQSDVT 

535 gaatcaaacaaagaagataacgactattcagatgaagaactagttgataagttagatttagattag 600 
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113 ESNKEDNDYSDEELVDKLDLD* 
44AHJDORF114 

16172 atggttaatgttgataatgcaccagaagaaaaaggacaagcctatactgaaatgttgcaactattcaataaactgattcaatgg 

1 MVNVDNAPEEKGQAYTEMLQLFNKLIQW 

16088 aatccagcttatacatttgacaatgcaattaacttattatcggcttgccaacaactattattaaactataatagttctgttgtt 

29 NPAYTFDNAINLLSACQQLLLNYNSSVV

16004 caattcttaaatgatgaactaaacaacgaaactaaaccagaatcaatattgtcttatattgctggtgatgacccaatagaacaa

57 QFLNDELNNETKPESILSYIAGDDPIEQ 

15920 tggaatatgcataaaggattttatgaaacgtataacgtttacgttttttag 15870 

85 WNMHKGFYETYNVYVF* 
44AHJDORF014 

6243 atgaaaatggtacatttacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctatt 

1 MKMVHLHVVFYQYLHVSVVQNYQNLMAI 

6327 ggttccaaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggca 

29 GSNQTVIHHITKFVYQMVTYGLVITGKA 

6411 cacgttattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcat 

57 HVIIYQCANGMEKQVIVTVLVFLGGCSH

6495 aatgggtattttagcctttttctttga 6521 

85 NGYFSLFL* 
44AHJDORF015 

15403 gtgacgataacaccttgttcaccgaattttgattctttgtttgtgaataatgctctaacgatatactcttttttcataccgtat 

1 VTITPCSPNFDSLFVNNALTIYSFFIPY

15487 ttttctactaattctgacagtttgataaattctctttctttttcctcaaattcaaatctcgctaatgtgttttggtgtcttgat 

29 FSTNSDSLINSLSFSSNSNLANVFWCLD 

15571 aaaatatcttttacgtttgtcattttatttctcctcttatttaaattatttgctttctgcaattgcgatttgtag 15645 

57 KISFTFVILFLLLFKLFAFCNCDL* 
44AHJDORF016 

15852 atgaaagttgacgacattgttaccttacgtgtcaaaggttatatacttcattacttagatgatgataatgaatacattgaggaa 

1 MKVDDIVTLRVKGYILHYLDDDNEYIEE 

15768 tttttaccacttcacgagtatcatttaaccaaaacacaagcaaaagaattattaccagacacatgtaaactattgtccactaca 

29 FLPLHEYHLTKTQAKELLPDTCKLLSTT 

15684 cgcacaacgaaaacaattcaagtttattacaatgatttactacaaatcgcaattgcagaaagcaaataa 15616 

57 RTTKTIQVYYNDLLQIAIAESK* 
44AHJDORF017 

10757 atggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgattttgaaacctttgtacgtgaac 

1 MERLKLLLLVYRKTPLIQASILKPLYVN

10673 aattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaatatcgatatatccgtctaaaactg 

29 NSLTVPLLKTIKVSIMSKVQYRYIRLKL 

10589 aaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatga 10536 

57 KLYVVMYMMNILLMNLI* 
44AHJDORF018 

1098 atgttaattggtactgtgtccataaCcacgtattcttcactatattgtccaataaaatcttgctctttagctaaccaattaaaa 

1 MLIGTVSIITYSSLYCPIKSCSLANQLK 

1014 cgattacctaatgcaatatcgattaataaagtctcattaatcttagggaataaatatttatttacaaatgtttcgaacattgta 

29 RLPNAISINKVSLILGNKYLFTNVSNIV 

930 tttgaattatcccatttgtcgccaaatgtccaagattttgaataa 886 

57 FELSHLSPNVQDFE* 
44AHJDORF019 

9836 atgttacctggtttgtataagtattcttttttgaataaaggtacaccaattgcttttttatatttttctggtaactgtgcatat 

1 MLPGLYKYSFLNKGTPIAFLYFSGNCAY 

9752 gtccagttaccaccaatcacacgaccactttttccatttggcttgactgatttaccactaattggtttatggtctccgtcatca 

29 VQLPPITRPLFPFGLTDLPLIGLWSPSS 

9668 tcagtaggattagaactactactcccactatctacttga 9630 

57 SVGLELLLPLST* 

44AHJDORF121 

16362 atggaaaatgaaacaaaaaacattgagttgaagcatgtttttcgttttaagaatggaagtttatgtatagcgttatttgataga 

1 MENETKNIELKHVFRFKNGSLCIALFDR

16278 acagaaaatgaaatttcattttatgatgttgacattgatgaaattgaagatttaaaccataattctgttttacgcgtaatttca

29 TENEISFYDVDIDEIEDLNHNSVLRVIS 

16194 actttattaggaagtgataataatggttaa 16165 

57 TLLGSDNNG* 
44AHJDORF020 

13865 atgtctaaacgattttgttctaccatgtttttgctccctgtaatagttcatgatgtcgtttacagtgttaaatttattcgtcaa 

1 MSKRFCFTMFLLLVIVYDVVYSVKFIRQ 

13949 atgttgcataatataaaaagttatacctcacatcttcatcatcaatatttgtcactggtctatctgatttaccaatttctttat

29 MLHNIKSYTSHLHHQYLSLVYLIYQFLY

14033 ataaagtatcgatttctttaa 14053

57 IKYRFL*

44AHJDORF123

614 atgtatgagggaaacaacatgcgttctatgatgggtacatcatatgaagattcaagattaaataaacgaacagaattaaatgaa 

1 MYEGNNMRSMMGTSYEDSRLNKRTELNE 

698 aacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacaggtgac 

29 NMSIDTNKSEDSYGVQIHSLSKQSFTGD 

782 gttgaggaggaataa 796 

57 VEEE*




44AHJDORF021 

5816 atgcaccatcaaagtcaacacctgccccctcacgcttatatatccattcttttgcttgttgttgtgatttcatttatatcactc 
1 MHHQSQHLPPHAYISILLLVVVISFISL

5732 ctattcttgatgttctgctacccaaccatattcacgatgttttgtttccgcattaacattactgaagaattctttatattccga 
29 LFLMFCYPTIFTMFCFRINITEEFFIFR 
5648 tatattagcctctaa 5634 
57 YISL*

44AHJDORF022 

8611 atgtttgctaaaatgataatacagaatatcaataattttttagaaaatcctctcattgatttttttgaccataagttattattt 
1 MFAKMIIQNINNFLENPLIDFFDHKLLF

8527 ttaattgcttttgaaatacctgtaataatatcaacgaacattaatacaaataaaaagtag 8468
29 LIAFEIPVIISTNINTNKK* 
44AHJDORF023 

6494 atgagaacaccccccaaggaataccaacactgtaactattacctgtttttccattccattggcgcactggtaaataataacgtg 
1 MRTPPKEYQHCNYYLFFHSIGALVNNNV
6410 tgccttgccagttataaccaatccatacgtaaccatctgataaacaaacttcgttatatggtgtataaccgtctggttggaacc
29 CLASYNQSIRNHLINKLRYMVYNRLVGT
6326 aatagccattag 6315
57 NSH*

44AHJDORF024 

14275 gtgtcaacgcacgcctcttgtaaatctttatcatcaaatttaaaattaacattactaaaatcattcaaaaataaatctttttct 
1 VSMYASCKSLSSNLKLTLLKSFKNKSF S 

14359 tgctcttCCctagcttctctttcttttttccatctatccatttcagacgtatgtctaaccaatgtitatcaacctccatataaag 
29 CSFLASLSFFHLSISDVCLTNVINLHIK 
14443 cataaataa 14451 
57 HK*

44AHJDORF025 

15175 atggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgtttgaagat 
1 MERKYKTVLLYCDEIKGHFPHQISMFED

15091 ttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatagaatacattaag 
29 LYDAKVVYSYYEYNLFTKKYAYIIEYIK

15007 gagatataa 14999 
57 EI* 
44AHJDORF026 

14593 atgaataacctattaaacatagccattgttttccttttagcatttttaattacacttatcatacttatgacactgcatatacgc 
1 MNNLLNIAIVFLLAFLITLIILMTLHIR

14509 gtgtcatttggtgttttattcactacattgattatattctatattatctttttaatggttatttatgctttatatggaggttga 14426

29 VSFGVLFTTLIIFYIIFLMVIYALYGG*

44AHJDORF027 

12916 atgattgtctatatccctaattttagtacaaaattcatattgttttgtatatggtacaacgataatatttgtcataaaagtagt 
1 MIVYIPNFSTKFILFCIWYNDNICHKSS

13000 tacattatacatgactttaatatatttatcatcagttttgatatagaagaaatcaccgttttgattgatgtgatttcttaa 13080

29 YIIHDFNIFIISFDIEEITVLIDVIS*
44AHJDORF029 

15183 gtgtttaaatggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgt 
1 VFKWNVNTKRYYYIAMRLKDIFHIKSQC 
15099 ttgaagatttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatag 15019

29 LKIYMTLKLYIHIMNITCSLKNTRIS*



44AHJDORF028

9235 atggaatatatgcacgtccaattgtacctgctttcatattttttgcaaaatctgcattaccttttctttgtacgtcttgtggta 
1 MEYMHVQLYLLSYFLQNLHYLFFVRLVV 
9151 caaagtggacgatgttacctgcgtcataccaagacggttgtccagcttgttttgattgtgatactaactttcttgctatga 9071 
29 QSGRCYLRHTKTVVQLVLIVILTFLL*

44AHJDORF030 

14487 gtgaataaaacaccaaatgacacgcgtatatgcagtgtcataagtatgataagtgtaattaaaaatgctaaaaggaaaacaatg
1 VNKTPNDTRICSVISMISVIKNAKRKTM

14571 gctatgtttaataggttattcatggtcaatcactttcccattatcgtatatgactttgtttcgataaataatcattaa 14648 
29 AMFNRLFMVNHFPIIVYDFVLINNH* 
44AHJDORF031 

11039 atgatattgtatagttcattgctatcatctaaacggaataagttaaaatgtgaacgtaatgcaggtatgccatataatccattt
1 MILYSSLLSSKRNKLKCERNAGMPYNPF

11123 aaaacgactttagataacataacctcctcatttgagtatgggtgttcgttgatatcatcagtaatgtga 11191 
29 KTTLDNITSSFEYGCSLISSVM*
44AHJDORF135

693 atgaaaacatgtcaattgatacaaacaaaagtgaagatagttatggtgtacaaactcattcacttccaaaacaatoattCacag ' 
1 MKTCQLIQI KVKIVMVYKFI H F Q N H L Q 

777 gtgacgttgaggaggaataataaattatggcacaacaatctacaaaaaatgaaactgcacttttag 842 
29 VTLRRNNKLWHNNLQKMKLHF* 
44AHJDORF033 

3795 atgccattatttaaccacctctaccaaatttgtaaaaaacattttttatcaaattcatttaaaattttctttcttaaatcgtac 
1 MPLFNHLYQICKKHFLSNSFKIFFLKSY
3711 gctttatcaatattatcaattaaatactgcttagtgaattgtgtaccttttgcattacctttttga 3646
29 ALSILSIKYCLVNCVPFALPF* 
44AHJDORF032 

9455 atggcttgttttgctaaagcgagtagtgaaccaccactgtcaccactactaccactgtcagacgaatcactaggtgatccacct
1 MACFAKASSELPLSPLLPLSDESLGDPP 
9371 ttaccgtctaatttaccaccccaagctagaatagtattcgcaccgtctaaaaatggattaccatag 9306 
29 LPSNLPPQARIVFAPSKNGLP* 
44AHJDORF034 

14146 atgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagctaaaaataataattcttcaatgt 
1 MMILIIKTLKSINTLYIIYKLKIIILQC

14062 ataaatatattaaagaaatcgatactttatataaagaaactggtaaatcagatagaccagtga 14000
29 INILKKSILYIKKLVNQIDQ*

44AHJDORF035 

13957 atgcaacatttgacgaataaatttaacactgtaaacgacatcataaactattacaaggagcaaaaacatggtaaaacaaaatcg 
1 MQHLTNKFNTVNDIINYYKEQKHGKTKS 
13873 tttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataa 13811 
29 FRHGKRLSKCCQSCQKKNPR*
44AHJDORF036 

10165 gtgtatacaataccacacgtgatggtgcaacatatggtggtacattatagtttgcaactaaaaacgaaccatcttcaaaaactg 
1 VYTIPHVMVQHMVVHYSLQLKTNHLQKL 
10081 ctacaacaacacctgtgtgaccaataccatatgcagttgcttgtaagtatggtggtttactag 10019 
29 LQQHLCDQYHMQLLVSMVVY*
44AHJDORF037 

14788 atgtcgatatctaacgtaaataactctttttcaatttcaaaatcatcatattgtttgtcaaactcaacatacacatcacccata 
1 MSISNVNNSFSISKSSYCLSNSIYTSPI 
14872 tttatttttactatacattttttattagatgaagtaaatttttcaaatttatcattataa 14931 
29 FIFTIHFLLDEVNFSNLSL* 
44AHJDORF038 

3671 gtgtaccttttgcattacccttttgattttgattacgttttgcgttttgattactttcgttactcgatttattcacagttttac 
1 VYLLHYLFDFDYVLRFDYFRYS IYSQFY 

3587 cgttatcaatcgtattattatcagcgaatcgtaacgttgtattatcaacatcaatgttaa 3528 
29 RYQSYYYQRIVTLYYQHQC* 
44AHJDORF039 

1743 gtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaaccaaaatcagatttatatg 
1 VLYLLMMYLNLKSLLATLKKLNQNQIYM 
1827 cgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaa 1883 
29 RLFWILIQLNINVTQKVC* 
44AHJDORF040 

9740 gtggtaactggacatatgcacagttaccagaaaaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttataca 
1 VVTGHMHSYQKNIKKQLVYLYSKKNTYT 
9824 aaccaggtaacatatttcctcaaacgggtaatgcaggacaatgtacagaattaa 9877 
29 NQVTYFLKRVMQDNVQN* 
44AHJDORF041

15836 atgtcgtcaactttcattattatatcactcctttctaaaaaacgtaaacgttatacgttccataaaatcctttatgcatattcc 
1 MSSTFIIISLLSKKRKRYTFHKILYAYS
15920 attgttctattgggtcatcaccagcaatataagacaatattgattctggtttag 15973 
29 IVLLGHHQQYKTILILV* 

44AHJDORF042

5151 atgcacgaccgtcgtcttttgttaatttatagttttgtgaacctcttgcgcgtaatgcttcaaagtgttcatactcaccaagtt 
1 MHDRRLLLIYSFVNLLRVMLQSVHTHQV
5067 ggaagaaaccatataaattatggaaacgttttccaccaccgccgtttgtcatag 5014
29 GRNHINYGNVFHHRRLS*

44AHJDORF043 

4539 atgcgacttgtaacagttttgcaacaccatcgtgatgtaaccagattttcatttcaccattggattgacgttctaatccgattg 
1 MRLVTVLQHHRDVTRPSFHHWIDVLIRL 
4455 ttgtaccatgaccaccctgtacaatacgcatgcttgaaattaagtcaccactag 4402 
29 LYHDHPVQYACLKLSHH* 
44AHJDORF044 

12917 atgttacctatttacgtgatgatatgttttataaagaaaacatggaacgttattactacaatccaagcaatttacattttgaca 
1 MLPIYVMICFIKKTWNVITTIQAIYILT 
12833 atgcttactctaaaaattacgtggttgataatgatagatatttatatttag 12783 
29 MLTLKITWLIMIDIYI* 
44AHJDORF149 

770 atgattgttttgaaagtgaatgaatttgtacaccataactatcttcacttttatttgtatcaattgacatgttttcatttaatt 
1 MIVLKVNEFVHHNYLHFYLYQLTCFHLI 
686 ctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatag 639 
29 LFVYLILNLHMMYPS* 
44AHJDORF046 

4891 atgattatccatttaagttatcataccaagacggtattaatttcccacgtgataactttaaagagcctgagggtatttgcattt
1 MIIHLSYHIKTVLISHVITLKSLRVFAF

4975 atacaaatccaaaaacaaaacgtaaatcgttattacttgctatga 5019
29 IQIQKQNVNRYYLL*
44AHJDORF047 

11911 atgaatgtatgtaagttgttcaggtgtgagttttgcaaaacatttcacagcatagtcataggcttcactatcattcatatcatt 
1 MNVCKLFRCEFCKTFHSIVIGFTIIHII
11995 atctttatcaaaaatcgtataattaaaatctgttttaagttgtga 12039
29 IFIKNRIIKICFKL* 
44AHJDORF045 

10655 atggcaccgtcaaagaattgttcacgtacaaaggtttcaaaatcgacgcttgtatcaaaggcgtttttcggtataccagcagaa 
1 MAPSKNCSRTKVSKSTLVSKAFFGIPAE 
10739 gcaattttaatctttccattcacttcatatgcatatttcttatga 10783 
29 AILIFPFTSYAYFL* 
44AHJDORF048 

15340 atgaggacgttgttgacattatcaatgctggagaagtccaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggtt 
1 MRTLLTLSMLEKFNSQPMNMKTKKVKKV 
15256 actcaatcaattttggtcaagtatcattttaatacaaettcatag 15212 
29 TQSILVKYHFNTIS* 
44AHJDORF049 

5784 atgagggggcaggtgttgactttgatggtgcatatggatttcaatgtatggacttatcagttgcttatgtgtattacattactg 
1 MRGQVLTLMVHMDFNVWTYQLLMCITLL 
5868 acggtaaagttcgcatgtggggtaatgctaaagacgcgataa 5909 
29 TVKFACGVMLKTR* 
44AHJDORF050 

13158 gtgtgttacgcttttcattcacgtaatcgtttcgtcgcatttctaaaaaaatgtttttgtaaagtcttgatgtattcattttat 
1 VCYVFHSRNRFVAFLKKCFCKVLMYSFY 
13242 gcttttgtaataaattgtatatatttaaattggataatacag 13283 
29 AFVINCIYLNWII* 
44AHJDORF051 

11066 atgataacaatgaactatacaatatcattaacggttacaaaaacactgaacgtaatatattattctctacatttgtcacatcac 
1 MITMNYTISLTVTKTLNVIYYSLHLSHH

10982 gttcattgtataacttattggttcccttccaatacttaa 10944 
29 VHCITYWFLSNT* 
44AHJDORF052 

14338 atgattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatc 
1 MILVMLILNLMIKIYKRRTLTHGNILHI
14254 tgccctattttcctaaagaaagaaacgtatcatatgtaa 14216 
29 CPIFLKKETYHM* 
44AHJDORF053

3348 atgtggtttattcatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaaca 
1 MWFIHQVKLKNTYNHKASQNTMKIQQVT 
3432 ctgatgaaacatcgaatcaaaatgctacatctttag 3467 
29 LMKHRIKMLHL* 
44AHJDORF054 

7551 atgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatta 
1 MTGMEIRCYSTLVRFHKKLVLSYVQNQL 
7635 ttggttatcataatgaagttcgagtatatccagtag 7670 
29 LVIIMKFEYIQ* 

44AHJDORF055 

15705 atgtgtctggtaataattcttttgcttgtgttttggttaaatgatactcgtgaagtggtaaaaattcctcaatgtattcattat 
1 MCLVIILLLVFWLNDTREVVKIPQCIHY

15789 catcatctaagtaatgaagtatataacctttga 15821 
29 HHLSKEVYNL* 
44AHJDORF056 

5512 gtgagtattacattacaggtaaccaaatggaattatttagagacgcgccagaagaaattaaaaaagtgggtgcatggttacgtg 
1 VSITLQVTKWNYLETRQKKLKKWVHGYV 
5596 tgtcaagcggtaacgcagtcggtgaagtaa 5625 
29 CQVVTQSVK* 
44AHJDORF057 

10121 atgtaccaccatatgttgcaccatcacgtgtggtattgtacacactcatcaatggcgtaccaaacaatgctggtgataatattg 
1 MYHHMLHHHVWYCIHSLMAYQI MLVI I 'L 

10205 tattctttagtggtattgcttaattaa 10231 
29 YSLVVLLN* 
44AHJDORF058 

10767 atgcatatttcttatgattcagtacaaacatctcatctatctgttcgttttcaatatcccatttacctaaggctatcgggtcga 
1 MHISYDSVQTSYLSVRFQYPIYLRLSGR 
10851 ataaactggggttcaataagggtttaa 10877 
29 INWGSIRV* 
44AHJDORF164 

702 atgttttcatttaattctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatagaacgcatgttgtttccctca 
1 MFSFNSVRLFNLESSYDVPIIERMLFPS

618 tacatgtttaaattcctcctaatctaa 592
29 YMFKFLLI*

44AHJDORF059

8360 atggattttgtaacattggattacctgaaccgtcattacgccaaaatcccacaccagattctaaaattgcttttaattfttcca 
1 MDFVTLDYLNRHYAKlLHQILKLIj-falVP 
8276 ttaacatggggtcgatgtcacgtatag 8250 
29 LTWGRCHV*

44AHJDORF060 

6257 atgtaccattttcatttctataatatgtgccgtattggtttcgtttccattttccaaatgtatttactttcgatgtttctaatg 
1 MYHFHFYNMCRIGFVSIFQMYLLLMFLM
6173 ctttgctattactacctgaaaatttag 6147
29 LCYYYLKI* 
44AHJDORF061 

15551 atgtgttttggtgtcttgataaaatatctttcacgtttgtcattttatttctcctcttatttaaattatttgctttctgcaatt 
1 MCFGVLIKYLLRLSFYFSSYLNYLLSAI 
15635 gcgatttgtagtaaatcattgtaa 15658 
29 AICSKSL* 
44AHJDORF062 

4285 gtggtattcgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaa
1 VVFATQLTNLLILIKKQITCTLHNPILK 
4369 aacctgaaggtttttggataa 4389 
29 NLKVFG* 
44AHJDORF063 

9487 atgcgtcttgtattttttttaataattcttgcatggcttgtttcgctaaagcgagtagtgaactaccactgtcaccactactac 
1 MRLVFFLI ILAWLVLLKRVVNYHCHHYY 

9403 cactgtcagacgaatcactag 9383 
29 HCQTNH* 
44AHJDORF065 

5029 gtggtggaaaacgtttccataatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggtt 
1 VVENVSIIYMVSSNLVSMNTLKHYAQEV

5113 cacaaaactataaattaa 5130
29 HKTIN*

44AHJDORF064 

2609 atgacgagtcaatcaatcaacttgtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatg 
1 MTSQSINLCPKYITVHHLLKCHLCLMQM 
2693 acgatatcattgatttaa 2710 
29 TISLI*

44AHJDORF066 

10481 atgatattctttatattgaaagtgacatcggttcattttcacttaacgacttatctccagttgaacgttcagtacataacaaat 
1 MI FFILKVTSVHFHLTTYFQLNVQYITN 

10397 ctgatttgcatatattaa 10380 
29 LICIY*
Table 19



Sequence similarities between ORFs 44AHJD and public databases

Phage: 44AHJD

Database: nr

Columns: database hit; score (bits); E value

Query= sid|110871|lan|44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1
(761 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0...  55  1e-06
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|e...  53  6e-06
gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage...  49  1e-04
gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage...  46  0.001
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP...  45  0.002
gi|2435429 (AF012250) unassigned reading frame (possible DNA po...  45  0.002
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po...  45  0.002
gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora...  44  0.004
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833...  44  0.004
gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PHO...  41  0.041
gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation f...  40  0.070
gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) [Bact...  39  0.092

Query= sid|110872|lan|44AHJDORF002 Phage 44AHJD ORF|3789-5732|3
(647 letters)

gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTE...  112  7e-24
gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]  52  1e-05
gi|4038407 (AF103943) factor C protein precursor [Streptomyces ...  39  0.10

Query= sid|110873|lan|44AHJDORF003 Phage 44AHJD ORF|6626-8389|2
(587 letters)

gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >...  92  8e-18
gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >...  82  1e-14
gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B...  78  2e-13
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2...  71  2e-11
gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage...  54  3e-06
gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage...  42  0.010

Query= sid|110875|lan|44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1
(415 letters)

gi|3845203 (AE001399) GAF domain protein (cyclic nt signal tran...  52  6e-06
gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; MA...  49  5e-05
gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]  48  1e-04
gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon; ...  47  2e-04
gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]  46  6e-04

Query= sid|110877|lan|44AHJDORF007 Phage 44AHJD ORF|2044-3027|1
(327 letters)

gi|1181960|emb|CAA87731.1| (Z47794) connector protein [Bacterio...  46  5e-04
gi|1429239|emb|CAA67658| (X99260) upper collar protein [Bacteri...  45  8e-04
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...  44  0.002
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...  41  0.009

Query= sid|110878|lan|44AHJDORF008 Phage 44AHJD ORF|3020-3775|2
(251 letters)

gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase ...  52  3e-06
gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SP...  46  2e-04
gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon; MA...  46  2e-04
gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP) >...  46  3e-04
gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]  46  3e-04
gi|2952545 (AF051898) coronin binding protein [Dictyostelium di...  45  6e-04
gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium reic...  45  7e-04
gi|1429240|emb|CAA67659| (X99260) lower collar protein [Bacteri...  44  0.001

Query= sid|110879|lan|44AHJDORF009 Phage 44AHJD ORF|5744-6496|2
(250 letters)

gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine ...  180  1e-44
gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN (N-ACETYLMURAMOYL-L-AL...  118  6e-26
gi|1763243 (U72397) amidase [bacteriophage 80 alpha]  118  6e-26
gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphyloco...  84  9e-16
gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]  84  9e-16
gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187 ...  77  2e-13
gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE AL...  73  2e-12
gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus sim...  69  3e-11
gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GL...  69  3e-11
gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-G...  69  3e-11
gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hy...  68  6e-11

Query= sid|110882|lan|44AHJDORF012 Phage 44AHJD ORF|8391-8813|3
(140 letters)

gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN ...  80  6e-15
gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-...  76  1e-13
gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN ...  61  4e-09
gi|2293160 (AF008220) YtkC [Bacillus subtilis] >gi|2635548|emb|...  36  0.099
gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophag...  31  3.3

Table 20

Homologies between phage 44AHJD ORFs and proteins in public databases



Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1
(761 letters)
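
As a consistency check on a query header, the number of letters can be compared with the ORF coordinates. The sketch below assumes that the coordinates are inclusive and that the span includes the stop codon, which is not counted as a letter; this assumption agrees with the headers shown in this section (for example, 10342-12627 and 761 letters). The function names are hypothetical.

```python
import re

# Matches the "ORF|start-end|frame" tail of a query header, tolerating
# stray spaces around the separators.
HEADER = re.compile(r"ORF\s*\|\s*(\d+)\s*-\s*(\d+)\s*\|\s*(-?\d+)")


def parse_query_header(line):
    """Return (start, end, frame) from a query header line."""
    match = HEADER.search(line)
    if match is None:
        raise ValueError("no ORF coordinates found")
    start, end, frame = (int(g) for g in match.groups())
    return start, end, frame


def expected_letters(start, end):
    """Residue count for an inclusive span that ends with a stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3 - 1


start, end, frame = parse_query_header(
    "Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1"
)
print(start, end, frame, expected_letters(start, end))
```

A mismatch between the computed value and the printed letter count would suggest a transcription error in one of the coordinates.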

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 55.4 bits (131), Expect = 1e-06

Identities = 96/426 (22%), Positives = 159/426 (36%), Gaps = 88/426 (20%)

KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNK^ FQ 283 

++TPE+ YI ND+ 1+ DI +++T + ++ + + T+ F 



L+ D +1 + YRGG N KY K I E D+NS YP 

KLSLPMDKEI RKAYRGGFTWLNDKYKEKE IGEGMV- FDVNSLYP 252 



MY+P YP+ +D+LYI+F+L K+ + 
3MYSRPLP YGAPIVFQGKYEKDEQYPLY- IQRIRFEFEL KEGY1PTI 299 



++ +T +D 1+ + + +Y EY F + 

PVELYLTNVDLEL1QEH-YELYNVEYIDGFK FRE 352 

DDINEH PYSNEEVMLS KWLNGLYG I PAL 511 

+ L+K++LN LYG +P L 
EEGAKKQLAKLMLNSLYGKFASNPDVTG KVPYL 403 



Query: 


229 


Sbjct: 


154 


Query: 


284 


Sbjct: 


210 


Query: 


344 


Sbjct: 


253 


Query: 


404 


Sbjct: 


300 


Query: 


463 


Sbjct: 


353 


Query: 


512 


Sbjct: 


404 


Query: 


571 


Sbjct: 


450 


Query: 


626 


Sbjct: 


509 



YK+ + F+T+ + + + Q D 
- YKDPVYT PM-GVF I TAWARFTT I TAAQACY DRI 449 



IYCDTDSLYMKSWKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKK YAYEVNG 625 

IYCDTDS+++ P + +DPLGWE+ + LK Y EV+G 



>gi|1072656|pir||S51275 DNA polymerase - phage CP-1

>gi|836593|emb|CAA87725.1| (Z47794) DNA polymerase
[Bacteriophage CP-1]
Length = 568

Score = 53.5 bits (126), Expect = 6e-06

Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)
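
For illustration, the statistics line of an alignment can be read into numbers and the printed percentages recomputed. The sketch below assumes that the percentages are truncated toward zero rather than rounded, an assumption that matches the values printed for this hit; `alignment_stats` and `truncated_percent` are hypothetical helper names.

```python
import re

# Captures "Name = numerator/denominator" for the three reported statistics.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def alignment_stats(line):
    """Map each statistic name to its (count, aligned_length) pair."""
    return {name: (int(n), int(d)) for name, n, d in STAT.findall(line)}


def truncated_percent(count, length):
    """Integer percentage, truncated (assumed convention of the listing)."""
    return count * 100 // length


line = "Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)"
stats = alignment_stats(line)
for name, (count, length) in stats.items():
    print(name, count, length, truncated_percent(count, length))
```

Recomputing the percentages in this way gives a quick test of whether the counts on a statistics line were transcribed consistently.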



+ PE + YIH DV IL G+ ++Y + F Y + +L + +F+ 



D K+ D+ + G + K+ + +++ DINS YP M 

LDEKVD DFCRKHIVGAGRLPTLKHRGRTLNQLIDIYDINSMYPATML 257 



Query: 


230 


Sbjct: 


152 


Query: 


288 


Sbjct: 


210 


Query: 


348 


Sbjct: 


258 


Query: 


406 


Sbjct: 


313 



+P + + Y P + +D+Y+ + K D D+
Query: 466 KLKN KINMTS PYDYHI TDD INEH PYSNEEVMLS KWLNGLYG I P ALR - - SH FNLFRLDDN S23

+ Y Y E+ S E +K++LN LYG + S L LDD 

Sbjct: 367 TTYRYK KENAQSPAEKQKAKIMLNSLYGKFGAKI ISVKKLAYLDDK 412 

Query: 524 NE LYN I ING YKNTERN I L F S T FVT S RS L YNL L VP FQYLTES E IDDNF I YCDTD S 577 

L +KN + + + FVTS + + ++ Q E DNF+Y DTDS 

Sbjct: 413 GILR FKNDDEEEVQPVYAPVALFVTSIARHFIISNAQ ENYDNFLYADTDS 462 

Query: 578 LYMKSWKPLLNPSLFDPIAI/3KWDIENEQIDKWFVLNHKKYAYEVNGKIKIAS 637 

L++ +L+ DP GKW E +K LKYE++ +K 
Sbjct: 463 LHLFHSDSLVLD IDPSEFGKWAHEGRAV-KAKYLRSKLYTEELIQEDGTTHLDV-KG 517 

Query: 638 AFDTSVDFETFVREQFFDGAI I ENNKS I YNEQGT I S I YPSKTE I 681 

AT E EFGAE++ +G I Y + +1 
Sbjct: 518 AGMTPEIKEKITFEKFVIGATFEGKRASKQIKGGTLIYETTFKI 561 

>gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage
B103]

Length = 572
Score = 49.2 bits (115), Expect = 1e-04

Identities = 93/422 (22%), Positives = 155/422 (36%), Gaps = 88/422 (20%)

Query: 229 KLTPEQLTYIHNDVIIUSMCHIHYSDIFPNFDYNKLTFSLNIMESYIiNNEMrR FQ 283 

++TPE+ YI ND+ 1+ DI +++T + ++ + + T+ F 

Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQGLDRMTAGSDSLKGFKDILSTKKFNKVFP 209 

Query: 284 LLNQYQDIKISYTHYHFHDMNFYDYIKSFYRGGUJMYNTKYINKLIDEPCFSIDINSSYP 343 

L+ D +1 + YRGG N KY K 1 E D+NS YP 

Sbjct: 210 KLSLPMDKEI RRA YRGG FTWLND KYKEKE I G EGMV - FDVNSLYP 252 

Query: 344 YVMYHEKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKDVFNDDLLIKIKSRVTA 403 

MY +P YP+ +D+LY1+F+L K+ + 
Sbjct: 253 SQMYSRPLP YGAPIVFQGKYEKDEQYPLY- IQRIRFEFEL KEGYIPTI 299 

Query: 404 XXXXXXXXXXXXXXXXXXLRMIQ-DITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

++ +T +D 1+ + + +Y EY F + 

Sbjct: 300 Q I KKN P FFKGNEYLKNSGAE P VEL Y LTNVD LEL I Q EH - YEMYNVEY I DG F K FRE 3S2 

Query: 463 TQGKLKNKINMTS PYDYHI TOD INEHPYSNEEVMLSKVVLNGLYG IPAL 511 

G K 1+ + H + L+K++ + LYG +P 1* 

Sbjct: 353 KTGLFKEFIDKWTYVKTH- EKGAKKQLAKLMFDS LYGKFASNPDVTGKVP YL 403 

Query: 512 RSHFNL-FRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK+ + P+T+ + + + Q D 
Sbjct: 404 KEDGSLGFRVGDEE YKDPVYTPM-GVFITAWARFTTITAAQACY DRI 449 

Query: 571 IYCDTDSLYMKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKK YAYEVNG 625 

IYCDTDS+++ P + + DP LG W E+ + L K YA EV+G 

Sbjct: 450 IYCDTOSIHLTGTEVPEIIKDIVDPKKLGYWAHES-TFKRAKYLRQKTYIQDI 508 

Query: 626 KI 627 
K+ 

Sbjct: S09 KL 510 

>gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage
GA-1]

Length = 578
Score = 46.1 bits (107), Expect = 0.001

Identities = 80/376 (21%), Positives = 146/376 (38%), Gaps = 54/376 (14%)

Query: 234 QLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLNQYQDIKI 293 

++ Y+ +D++I+ + +F N D+ +T + + +Y EM + +Y + 

Sbjct: 162 EIEYLKHDLLIVALA LRSMFDH-DFTSMTVGSDALNTY- - KEMLGVKQWEKYFPVL- 214 

Query: 294 SYTHYHFHDMNFTOYIKSFYRGGLNMYNTKYIN1U,IDEPCFSIDINSS 353 

+ 1+ Y+GG N KY + + D+NS YP +M ++ +P 

Sbjct: 215 -SLK^SEIRKAYKGGFTWVNPKYG^ETVYGGMV-FDVNSMYPAMMKNKLLP- 264 

Query: 354 WLYFYEHYSEPTLIPTFIiDDDNYFSLYKIDKDVFNDDLLIKIKSRVLRQMXXXXXXXXXX 413 
Y EP + + + LY F + KI ++
Sbjct:


265 


Query: 


414 


Sb j ct : 


318 


Query: 


474 


Sbjct: 


360 


Query: 


S32 


SbjCt: 


420 


Query: 


589 


Sbjct: 


473 



286 

2S5 YGEPVMFKGEYKKNVEYPLYIQQVRCFFELKKDKI PC1QI KGNARFGQNEYLS 317 



+T +D 1+ + + I+E E+ 



S B+ + +K++LN LYG A 



+ LD+N L 



K ER+ +++ 



F+T+ 



N+L Q L 



FIY DTDS++++ + + 
-RFIYADTDSIHVEGLGEVDA 472 



+ DP LG WD E 



>gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2)
>gi|75812|pir||ERBP2Z DNA-directed DNA polymerase (EC
2.7.7.7) - phage PZA >gi|216051 (M11813) gene 2 product
[Bacteriophage PZA] >gi|224741|prf||1112171E ORF 2
[Bacteriophage PZA]
Length = 572

Score = 45.3 bits (105), Expect = 0.002

Identities = 98/461 (21%), Positives = 166/461 (35%), Gaps = 110/461 (23%)

Query- 198 Q LKTD FNYT I FDKDNDMND S EA YD YAVKCF AKLT PEQLTY I HNDVI I LGMCH I HY S D I F P 257 

++ DF T+ D D + Y ++TP++ YI ND+ 1+ + I 
Sbjct: 129 KIAKDFKLTVLKGDIDYHKERPVGY EITPDEYAYIKNDIQIIAEALL IQF 178 

Query: 258 NFDYNKLTFSLNIMESYLNNEMTR FQLLNQYQD I KI SYTHYHFHDMNFYDYI KS F 312 

+++T + ++ + + T+ F L+ D ++ Y 
SbjCt: 179 KQGLDRMTAGSDDLKGFKDIITTKKFKKVFPTLSLGLDKEVRYA 222 

Query: 313 YRGGL^JMYNTKYINKLIDEPCFSIDINSSYPYVMYHEKIPT 370 

YRGG N ++ K I E D+NS YP MY +P Y EP + 

SbjCt: 223 YRGGFTWLNDRFKEKEIGEGMV- FDVNSLYPAQMYSRLLP YGE P I VFEGKYV 273 

Query: 371 LDDDNYFSLYKID KDVFNDDLLIKIKSRVLRQMXXXXXXXXXXXXXXXXXXLRMI 425 

D+D +1 K+ + + IK +SR + 
Sbjct: 274 WDEDYPLH I QHI RCEFELKEG YI PT IQ I K - RSRFYKGNE YLKS SGGE I AD LW 324 

Query: 426 QDI TGIDCMH I RVNS F VI YECE YFHARD 1 1 FQNYFI KTQGKLKNKINMTS P YDYH I TDD I 485 

++ +D + + + +Y EY F T G K+ 1+ + I 

Sbjct: 325 - -VSNVD-LELMKEHYDLYNVEYISGLK FKATTGLFKDFIDKWTHIKTTSEGAI 375 

Query: 486 NEHPYSNEEVMLSKWLNGLYG IPALRSHFNL- FRLDDNNELYNIINGY 533 

+ L+K++LN LYG +P L+ + L FRL G 
SbjCt: 376 KQ LAKLMLNSLYGKFASNPDVTGKVPYLKENGALGFRL GE 415 

Query: 534 KNTERNIL- - FSTEVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSWKPLLNPS 591 

+ T+ + F+T+ + Y + Q D IYCDTDS+++ P + 

Sbjct: 416 EETKDP VYTPMGVF I TAWARYTT ITAAQAC F DRI I YCDTDS I HLTGTE I PDVI KD 470 

Query: 592 LFDPIALGKWDIENEQIDKMFVLNHKKYAY EVNGKI 627 

+ DPLGWE+ + LKY EV+GK+ 
Sbjct: 471 IVDPKKLGYWAHES -TFKRAKYLRQKTYIQDI YMKEVDGKL 510 



>gi|2435429 (AF012250) unassigned reading frame (possible DNA
polymerase) [Physarum polycephalum]
Length = 544

Score = 44.9 bits (104), Expect = 0.002

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%)



Query: 179 TSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDNDMNDSEAYDYAVKCFAKLTPEQLTYI 238 

T+LKLD+TQ F NM Y +CFLP++ I 

Sbjct: 62 TQLFNLLKSLQDSSFYTFKQ FTYQNIM YSLEISCF- -LYPKKKILI 105 

Query: 239 HNDVI ILGMCHIHYSDIFPNFD YNKL- -TFSLNIMESY-LNNEMTRFQLLNQYQD 290 

D+ +1 Y+D+ ++ YN++ +++NI Y L+ ++ + 

Sbjct: 106 -KDLYNFFSENIIYNDVVKDYKLLAILYNEIQTAYNININRKYILSTASLSLRIFKKSFP 164
Query: 291 IKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEK 350

K + D + +YI+ Y GG N I + + + + D+NS YPY+M EK 

Sbjct: 165 EKYRLIPHLTRDED- -NYIRKSYIGGRNE IFEHVAQRNYFYDVNSLYPYIMKKEK 217 

Query: 351 IPTWLYFYEHYSEPTLIPTFLDD-DNYFS LYKIDKDVFNDDLL- - -IKIKSRVLRQ 402 

+p + Y + + F + +N+F L I+K N +L + IK+ V 

Sbjct: 218 MPIGI PEYRDKEYMKKFEKNIENFFGFIDVLITIEKTNNNI PVLPYRMGIKNNV- EV 273 

Query: 403 MXXXXXXXXXXXXXXXXXXL^IQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

L + Q 1+ IY + ++++F+ Y + 

Sbjct: 274 G 1 I YAKGTLRG I YFSEEI KLALKQG YKI I E IYSAYEYKEKEWFEEYVEQ 323 

Query: 463 TQGK-LKNKINMTSPYDYHITDDINEHPYSNEEVMLSKWLNGLYG IPALRS 513 

+ LK K D+D LK +LN LYG I + 

Sbjct: 324 MYNRRLKAK DPALKD LYKKLLNT L YGRFGLVY EQ I D 1 1 S P 363 

Query: 514 HFNLFRLDDimELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC S73 

L + DN + + + + N ++ + ++ + FYT + + IY 
SbjCt: 364 EKEL- - ITDNTYISHDTTEFIDITANTCYNNIAITSAITSYARIFMYNTII^TYNLiHVIYI 421 

Query: 574 DTDSLYMKSVVKPLI^PSLFDPIAI^KWDIEKEQIDKMFVIjNHKKYAY-EW 632 

DTD L++K+ P+ + +L +GK+ +E+ + F+ N K Y Y +N I 

SbjCt: 422 DTDGLFLKN PIPDIALTTSKEMGKFRLESINAEAHFIAN-KFYIYAPINSPIIYKFK 477 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAIIENNKSIYNEQGT ISIYPSK 678 

GIP ND + + +F+INNY+Q+ IY + 

Sbjct: 478 GIPLQKPIFNIHDIITQHKKILt^ITIX3HHYFTFSIRLNNNQTYSFQA5RKRKLIPNYKTT 537 

Query: 679 TEIVC 683 
I+C 

SbjCt: 538 PWIIC 542 



>gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum
polycephalum) >gi|509721|dbj|BAA06121.1| (D29637) DNA
polymerase [Physarum polycephalum]
Length = 547

Score = 44.9 bits (104), Expect = 0.002

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%)

Query: 179 TSIATI^KKLLDGraYLTESQLKTDFlTYTIFDKDNDMNDSEAYDYAVXCFAKLTPEQLTYI 238 

T+LKLD+TQ F NM Y +CFL P++ I 

Sbjct: 65 TQLFNLLKSLQDSSFYTFKQ FTYQNIM YSLEISCF--LYPKKKILI 108 

Query: 239 HNDVI ILGMCHIHYSDIFPNFD YNKL--TFSLNIMESY-LNNEMTRFQLLNQYQD 290 

D+ +1 Y+D+ ++ YN++ +++NI Y L+ ++ + 

Sbjct: 109 -KDLYNFFSENIIYNDWKDYKKLAILYNEIQTAYNININRKYILSTASLSLRIFKKSFP 167 

Query: 291 IKISYTHYHFHDMt^YDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEK 350 

K + D + +YI + YGGN I + + + + D+NS YPY+M EK 

Sbjct: 168 EKYRLIPHLTRDED- -NYIRKSYIGGRNE IFEHVAQRNYFYDVNSLYPYIMKKEK 220 

Query: 351 I PTWLYFYEHYSEPTLI PTFLDD - DNYFS LYKIDKDVFNDDLL IKIKSRVLRQ 402 

+p + Y + + F + +N+F L I+K N +L + IK+ V 

Sbjct: 221 MPIGI pEYRDKEYMKKFEKNIENFFGFIDVLITIEKTNNNIPVLPYRMGIKNNV-EV 276 

Query: 403 MXXXXXXXXXXXXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

L + Q 1+ IY + ++++F+ Y + 

Sbjct: 277 G 1 1 YAKGTLRG I YFS EE I KLALKQGYKI I E IYSAYEYKEKEWFEEYVEQ 326 



Query: 463 TQGK-LKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG- IPALRS 513 

+LKK D+D LK +LN LYG I + 

Sbjct: 327 MYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQIDIISP 366 

Query: 514 HFNLFRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC 573 

L+DN++ ++ N++ ++++ FYT ++IY 
Sbjct: 367 EKEL--ITDNTYISHDTTEFIDITANTCYNNIAITSAITSYARIFMYNTILNYNLHVIYI 424 

Query: 574 DTDSLYMKSVVKPLI^PSLFDPIAI^KWDIENEQIDKMFVLNHKKYAY-EVNGKIKIASA 632 

DTD L++K+ P+ + +L +GK+ +E+ + F+ N K Y Y +N I 

Sbjct: 425 DTDGLFLKN- - -PIPDIALTTSKEMGKFRLESINAEAHFIAN-KFYIYAPINSPIIYKFK 480 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAIIENNKSIYNEQGT ISIYPSK 678
GIP N D ++ +F+INNY+Q+ IY +

Sbjct: 481 GIPLQKPIFNlKDIITQHKKILNITI^HHYFTFSIRLmiNQTYSFQASRKRKLIPNYKTT 540 

Query: 679 TEIVC 683 
I+C 

Sbjct: 541 PWIIC 545 

>gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora
crassa]
Length = 1035

Score = 44.1 bits (102), Expect = 0.004

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%)

Query: 521 DDNNELYNI INGYKNTERNILFSTFVTSRSLYNLLVPFQYLTBSEIDDNPIYCDTDSLYM 580 

+ N EL + ++G K+ I ++ ++++++ ++++ S Y DTDS+++ 

Sbjct: 817 EKNYELLSYLDGEKDDGFI INSTS I AAATASWSRI LMYKHI INSA YTDTDSIFV 870 

Query: 581 KS WKPLLNPSLFDPIALGKWDI ENEQIDKMFVLNHKKYAYEVNGKIKIASAG I PKNAFD 640 

+ KPL +++ K+ +1+ ++KY + GK++I GI KN + 

Sbjct: 871 E KPLDS AFIGEGCGKFKAEYNGQLI KRAI F I SGKLYLLDFGGKLEIKCKG ITKNKDN 927 

Query: 641 TSVDFETFVREQFFDG AIIENHKSIYNEQGTISIYPSKTEIVCGNVYDE 689 

T+ + + E ++G + + E GT+++ K ++ G YD+ 

Sbjct: 928 TTHNIiDINDFEALYNGESRVLFQERWGRSLELGTVTVKYQKYNLISG--YDK 977 



>gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE
>gi|283351|pir||S26985 probable DNA-directed DNA
polymerase (EC 2.7.7.7) - Neurospora crassa
mitochondrion plasmid maranhar (SGC3)
>gi|578156|emb|CAA39046| (X55361) putative DNA
polymerase [Neurospora crassa]
Length = 1021

Score = 44.1 bits (102), Expect = 0.004

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%)

Query: 521 DDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYM 580 

+ N EL + ++G K+ 1 ++ + + ++ ++ ++++ S Y DTDS+++ 

Sbjct: 815 EKNYELLSYLDGEKDDGFIINSTSIAAATASWSRILMYKHIINSA YTDTDSIFV 868 

Query: 581 KSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKKYAYEVNGKIKIASAGIPKNAFD 640 

+ KPL + ++ K+ +1+ ++KY + GK++I GI KN + 

SbjCt: 869 E KPLDSAPIGEGCGKFKAEYNGQLIKRAIFISGKLYLLDFGGKLEIKCKGITKNKDN 925 

Query: 641 TSVDFETFVREQFFDG AIIENNKSIYNEQGTISIYPSKTEIVCGNVYDE 689 

T+ + + B ++G + + E GT+++ K ++ G YD+ 

SbjCt: 926 TTHNLDINDFEALYNGESRVLFQERWGRSLEIiGTVTVKYQKYNLISG- -YDK 975 



>gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 
(PHOSPHOFRUCTOKINASE 2 II) (6PF-2-K 2) 
>gi|2131162|pir||S61066 6-phosphofructo-2-kinase (EC 
2.7.1.105) - yeast (Saccharomyces cerevisiae) 
>gi|2131163|pir||S71026 6-phosphofructo-2-kinase (EC 
2.7.1.105) - yeast (Saccharomyces cerevisiae) 
>gi|1085116|emb|CAA62371| (X90861) 
6-phosphofructo-2-kinase [Saccharomyces cerevisiae] 
>gi|1420028|emb|CAA99157| (Z74878) ORF YOL136C 
[Saccharomyces cerevisiae] >gi|1628439|emb|CAA64733| 
(X95465) 6-phosphofructo-2-kinase [Saccharomyces 
cerevisiae] 
Length = 397 

Score = 40.6 bits (93), Expect = 0.041 

Identities = 48/208 (23%), Positives = 92/208 (44%), Gaps = 29/208 (13%) 

Query: 175 MKTNTSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDNDM^ 234 

++ S AT+ K LL L+ + + FN K+ND ++ +A++T ++ 

SbjCt: 139 I RRQI SCAT I SKPLL LSNTSSEDLFN PKNNDKKET YARITLQK 181 

Query: 235 LTY - 1 HNDVI I LGMCHIKYSD I F PNFDYNKLTFS LNI ME SYLNNEMTRFQLLN QYQD 290 

L + I+ND +G+ SI + F + S+ +E++ F L+ Q 

SbjCt: 182 LFHEINNDECDVGIFDATNSTI ERRRFIFEEVCSFNTDELSSFNLVPIILQVSC 235 
Query: 291 IKISYTHYHFHDMNFY-DYIKSFYRGGLNTfYNTKYINKLIDEPCFSID- INSSYPYVMYH 348 

S+ Y+ H+ +F DY+ Y + + + + FS+D N + Y+ H 

Sbjct: 236 FNRSFIKYNIHNKSFNEDYLDKPYELAIKDFAKRLKHYYSQFTPFSLDEFNQIHRYISQH 295 

Query: 349 EKI PTWLYFYEHYSEPTLI PTFLDDDNY 376 

E+I T L+F+ + + P L+ +Y 
Sbjct: 296 EEIDTSLFFFNVINAGWEPHSLNQSHY 323 



>gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation 
factor sigma [Reclinomonas americana] 
Length = 532 

Score = 39.9 bits (91), Expect = 0.070 

Identities = 49/205 (23%), Positives = 84/205 (40%), Gaps = 14/205 (6%) 

Query: 100 NHFLIiKDTMRYFDNITRENI YLKSAEENEHTLKMKEATI LAKNQNVIL EKRVKSS IN 156 

N+ + + F + ++IY+ + +KE L K NVI+ K +K N 

Sbjct: 177 *TYLVKNSYLNLFKTVPHDSIYMNYSYIQTPI2JIL 236 

Query: 157 LDLTMFLNGFKFNI IDNFM KTNTSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDND 213 

L++++FL F + N++ K + + + K L Y+T L T Y K 
Sbjct: 237 UJISIJLYKFYQELKWNYIFINKISRNTQKINIKTLKNSYITFYNLITFIQYYTTKKQRL 296 

Query: 214 MNDSEAYDYAVKCFAK- - LTPEQLTYIHNDVI I LGMCHIHYSDI FPNFDYN - KLTFSLNI 270 

D +K F K P+ +N +1 G+ HI+ + N K+T I 

Sbjct: 297 KKDIFYKQIFIKTFLKQHKIPKINKIKNNSLIKYGLTHIYDMILISILRENIKVTLKNRI 356 

Query: 271 MES YLNNEMTRFQLLNQYQD I KI S Y 295 

+ +Y+ T + QY +KI Y 
Sbjct: 357 IFNYMPYITT ISKQY--VKIGY 376 



>gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) 
[Bacteriophage phi-29] 
Length = 575 

Score = 39.5 bits (90), Expect = 0.092 

Identities = 41/150 (27%), Positives = 64/150 (42%), Gaps = 36/150 (24%) 

Query: 497 LSKWLNGLYG I PALRSHFNL - FRLDDNNELYN I ING YKNTERN I L - - F 542 

L+K++LN LYG +P L+ + L FRL G + T+ + 

Sbjct: 381 LAKLMLNSLYGKFASNPDVTGKVPYLKENGALGFRL GEEETKDPVYTPM 429 

Query: 543 STFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSWKPLLNPSLFDPIALGKWD 602 

F+T+ + Y + Q D IYCDTDS+++ P + + DP LG W 

Sbjct: 430 GVF ITAWARYTT I TAAQACY DRI I YCDTD S I HLTGT E I PD VI KDI VDPKKLG YWA 484 

Query: 603 IENEQIDKMFVLNHKKYAY EVNGKI 627 

E+ ++ L K Y EV+GK+ 
Sbjct: 485 HES-TFKRVKYLRQKTYIQDIYMKEVDGKL 513 



Query= pt|110872 44AHJDORF002 Phage 44AHJD ORF|3789-5732|3 1 
(647 letters) 

>gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTEIN C 
>gi|478126|pir||D49757 teichoic acid biosynthesis protein 
tagC - Bacillus subtilis (strain 168) >gi|143727 
(M57497) putative [Bacillus subtilis] 
>gi|2636103|emb|CAB15594.1| (Z99122) alternate gene 
name: dinC [Bacillus subtilis] 
Length = 442 

Score = 112 bits (278), Expect = 7e-24 

Identities = 91/314 (28%), Positives = 147/314 (45%), Gaps = 58/314 (18%) 

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKP EG FW INKLT PS G 207 

F+ + PK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVLQSFNFDEKNHQIYTTQVASGLGKDNTQSYRITRLSLEG 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAYKDNYVLDLEEA 262 

+ SM + GGHGT IG+E + NG + IW +D ++L+ YK LD E + 

Sbjct: 67 LQLDSMLLKHGGHGTNIGIENR-NGTIYIWSLYDKPNETDKSELVCFPYKAGATLD-ENS 124 
Query: 263 KGLTDYTPQSI^KHTFTPLIDEANDKLILRFGI^TIQVRSRADVKNHIDNVEKEMTIDN 322 

K L ++ H TP +D N +L +R + D KN+ N ++ +TI N 

Sbjct: 125 KELQRFSNMPF- -DHRVTPALDMKNRQLAIR QYDTKNN - - NNKQWVT I FN 170 

Query: 323 SE NNDN - RWMQG I AVDGDDLYWLSGNSSVNSHVQIGKySLTTGQKI 367 

+ N +N ++QG +D LYW +G+++ S+ + + 
Sbjct: 171 LDDAIANKNNPLYTINI PDEIJTfLQGFFLDIXSYLYWYTGDTNSKSYPNL I TV 222 

Query: 368 YDYPFKLSYQDGINFPRD NFKEPEGICIYTNPKTKRKSLLLAMTNGGGGKRFH 420 

+D K+ Q I +D NF+EPEGIC+YTNP+T KSL++ +T+G G R 

Sbjct: 223 FDSDNKIVLQKE1TVGKDLSTRYE1WFREPEGICMYTNPETCAJCSLMVGITSGKEGNRIS 282 

Query: 421 NLYG FFQLGEYEHF 434 

+Y + YE+F 
Sbjct: 283 RIYAYH SYENF 293 



>gi|142847 (M64050) DNase inhibitor [Bacillus subtilis] 
Length = 125 

Score = 51.9 bits (122), Expect = 1e-05 

Identities = 35/116 (30%), Positives = 55/116 (47%), Gaps = 10/116 (8%) 

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F+ + PK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVMKTVIiQSFNFDEKNHQIYTTQ^ 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAXLLQVAYKDNYVLD 258 

+ SM + GGHGT IG+E + KG + IW +D ++L+ YK LD 

Sbjct: 67 LQLDSMLLKHGGHGTNIGMENR - NGT I Y I WSLYDKPNETDKS ELVCF PYKAG ATLD 121 



>gi|4038407 (AF103943) factor C protein precursor [Streptomyces 
griseus] 
Length = 324 

Score = 39.1 bits (89), Expect = 0.10 

Identities = 61/269 (22%), Positives = 102/269 (37%), Gaps = 33/269 (12%) 

Query: 172 VNQSINIDKETNHMYSTQSDSQKPEG FWINKLTPSGDLISSMRIVQGGHGTTIGLER 228 

V QS D ++ Q S P+ I +L SG+ + M ++ GHG +IG + 

Sbjct: 66 VC^SFTFDIVKRRLFVAQLKSGSPDDSGDliCITQLDFSGNKLGHMYLLGFGHGVSIGAQ- 124 

Query: 229 QSNGEMKI WLHHDGVAKLLQVAYKDNYVLDLEEAKGLTDYT PQS LLNKHTFT P 281 

+ +WD + + + + GT SLKHP 

Sbjct: 125 PVGADTYLWTEVD VNSNARGTRLARFKWNNGATLSRTSS ALAKHQ PVPG ATEKTC 179 



Query: 282 LIDEANDKLILRFGIXSTIQWSRADVKNHIDITVEKEMTIDNSENNDNRWMQGIAVDGDDL 341 

ID N+++ +R+ ++ +V+ V+D QGA+G + 

Sbjct: 180 AIDPVNNRMAIRYLTASGRRYGIYNVADIAAGVYDKPLSDVPHPTGLGTFQGYALYGSYV 239 

Query: 342 YWLSGN SSVNSHVQIGKYSLTTGQKIYDYPFKLSYQDGINFPRDNFKEPEGIC 394 

Y L+GN + NS+V + TG + + + G F+EPEG+ 
Sbjct: 240 YQLTGNPYGPDNPNPGNSYVS - - S VDVNTGALVQ RAFTRAGSTL TFRBPEGMG 290 

Query: 395 IYTNPKTKRKSLLLAMTNGGGGKRFHNLY 423 

IY + + L L +G G R NL+ 

Sbjct: 291 IYRTAAGEVR-LFLGFASGVAGDRRSNLF 318 



Query= pt|110873 44AHJDORF003 Phage 44AHJD ORF|6626-8389|2 1 
(587 letters) 

>gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) 
>gi|75850|pir||WMBPT9 gene 9 protein - phage phi-29 
>gi|215327 (M14782) tail protein [Bacteriophage phi-29] 
>gi|225364|prf||1301270D gene 9 [Bacillus sp.] 
Length = 599 

Score = 92.4 bits (226), Expect = 8e-18 

Identities = 126/618 (20%), Positives = 251/618 (40%), Gaps = 71/618 (11%) 

Query: 5 TNFKFFYNT P FT - DYQNTI H FNSNKERDD YFLNGRHFKSLD YS KQPY - NFI RDRME INVD 62 

TN + + PF+ DY+NT F S+ + ++F R + + SK + F ++ ++V 
Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF- -NRKSRVYEMSKVTFMGFRENKPYVSVS 66 
Query: 63 MQWHDAQGINYMTFLS-DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMm'QGNVLEQL 121 

+ +Y+ F + D+ ++ +YAFV ++E+ N V ++F ID + T+ ++ 

Sbjct: 67 LPIDKLYSASYIMFQNADYGNKWFYAFVTELEFKNSAVTYVHFEIDVLQTWMFDMKFQES 126 

Query: 122 SNVNIERQHLSKRTYNYMLPMIiRNNDDVLKVSNKOT 181 

I R+H+ K + P+D+L+++ + + ++F S 

Sbjct: 127 F - - - 1 VREHV- KLWNDDGTPTINTI DEGLSYGSEYDI VS VENHKPYDDMMFLVT I SKS IM 182 

Query: 182 FGT - - KKE PNLDTS KGTI YDNI T S P VNLYVMEYGDFI NFMDKMS AY PWI TQNFQK V 235 

GT ++E L+ ++ + + P+ Y+ + + D +1 N V 

Sbjct: 183 HGTPGEEESRLNDINASL-NGMPQPLCYYIHPF YKDGKVPKTYIGDNNANLSPIV 236 

Query: 236 QMLPKDFINTKDLEDVKTSEKITGLKTLKQGGKSKEWSLK-DLSL SFSNLQ 285 

ML F + D+ + +T LK K+ + LK D + H+ 

Sbjct: 237 NMLTNIFSQKSAVNDI -VNMYvTDYIGLKLDYKNGDKELKLDKDMFEQAGIADDKHGNVD 295 

Query: 286 EMMLSK KDEFKHMIRNEYMTIEFYDWNGNTMLLDAGKISQK 326 

+ + K KD+ ++ Y E D+ GN M L 1 + 

Sbjct: 296 TIFVKKIPDYEAI^IDTGDKWGGFTKDQESKLMMYPYCVT^^ 355 

Query: 327 TGVKLRTKSIIGYHNEVRVYPVDYNSAENDRPI 386 

+K++ + +G N+V DYN+ D + N+ S +N N 
Sbjct: 356 K-LKI QVRG S LGVSNKVAYSVQDYNA - - - DSALSGGNRLT AS LDSSLI NNNPN 404 

Query: 387 PI LINNGI LGQSQQANRQ- - KNAESQLI TNRIDNVLNG SDPKSRFYDAVS VASNLS P 441 

I I N h Q N+ +N +S ++ N I ++ G + + A+ +AS++ 
Sbjct: 405 DIAILNDYLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISAGASAAGGSAliGHASSV-- 462 

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLAI^PPSVTESEMGNAFQIANSINGLTMKISVPSPKE 501 

T + + QA+ D+A PP +T+ AF N G+ + + 

Sbjct: 463 TG MTSTAGN AVLQMQAMQ AKQ AD I AN I P PQLTKMGGNTAFD YGNG YRG VYV I KKQL KAEY 522 

Query: 502 I TFLQKYYMLFGFEVNDYNS F I EP INSMTVCNYLKCTGTYT I RDXDPMLMEQLKAILESG 561 

L ++ +G+++N + + NY++ + DI+ +++++ I ++G 

Sbjct: 523 RRSLSSFFHKYGYKINRVKK- - PMLRTRKAFNYVQTKDCFISGDINNKDLQEIRTIFDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

+ WH D GN ++N L 
Sbjct: 581 ITLWHTDNIGNYSVENEL 598 



>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA 
>gi|216058 (M11813) tail protein [Bacteriophage PZA] 
Length = 599 

Score = 81.9 bits (199), Expect = 1e-14 

Identities = 127/618 (20%), Positives = 248/618 (39%), Gaps = 71/618 (11%) 

Query: 5 TNFKFFYNTPFT - DYQNTI HFNSNKERDD YFLNGRHFKS LDYS KQPYNF I RDRME - INVD 62 

TN + + PF+ DY+NT F S+ + ++F + + SK + R+ I+V 

Sbjct: 9 TNVRILADVPFSNDYKOTRWFTSSSNQYNWF--NSKTRVYEMSKVTFQGFRENKSYISVS 66 

Query: 63 MQWHDAQG INYMTFLS - DFEDRRYYAFVNQIEYVKDVWKI YFVIDTIMTYTQGNVLEQL 121 

++ +Y+ F + D+ ++ +YAFV ++EY N ++F ID + T+ N+ Q 

Sbjct: 67 IJILDLLYNASYIMFQNADYGNKWFYAFVTELEYKNVGTT^ 125 

Query: 122 SNVNI ERQHLS KRT YNYMLPMLRKNDDVLKVSNKNYVYN - - QMQQYLENLVLFQS S ADLS 179 

S I R+H+ K + P+ D+L ++++ +Y + + L S + 
Sbjct: 126 SF--IVREHV-KLWNDDGTPTINTIDEGLNYGSEYDIVSVENHRPYDDMMFLVVISKSIM 182 

Query: 180 KKFGTKKE PNLDTS KGT I YDN ITS PVN LYVMEY GD FINFMDK 221 

+ E L+ ++ + + P+ Y+ + GD +N + 

Sbjct: 183 HGTAGEAESRLNDINASL-NGMPQPLCYYIHPFYKDGKVPKTFIGDNNANLSPIVNMLTN 241 

Query: 222 MSAYPWITQNFQKVQMLPKDFINTK DLEDVKTSEKITGLKTLKQGGKSKEWS 273 

+ + N V M D+I K +L+ K + G+ KG + 

Sbjct: 242 IFSQKSAVNNI - - VZIMYVTDYIGLKLDYKNGDKELKLDK^ 299 

Query: 274 LKDL SliSFSNI^EMMLSKKDEFKHMIRNEYOTIEFYDWNGNTMLLDAGKISQKTGV^ 330 

K +L + KD+ ++ Y E D+ GN M L I +K 

Sbjct: 300 KKI PDYETLEIDTGDKWGGFTKDQESKLMMYPYCVTEVTDFKGNHMNLKTEYIDNNK-LK 358 



Query: 331 LRTKSIIGYHNEVRVYPVDVNSAENDRPIIAKNKEILIDTCSFLNTNITFNSFAQVPILI 390 
++ + +G N+V DYN+ + L+ + L+T++ N+ + 1+ 
Sbjct: 359 I QVRGSLGVSNKVAYS I QDYNAGGS LSGGDRLTAS LDTSLINNNPNDIAI I - 409 

Query: 391 NNGI LGQSQQANRQ- - KNAESQLITNRIDNVLNGSDPKSRFYDAVSVASNLSP - 441 

N L Q N+ +N +S ++ N I +L G A + A SP 

Sbjct: 410 - NDYLS AYLQGNKNS LENQKSS I LFNG I VGMLGGG VSAGASAVGRS PFGLASSV 462 

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPS VTESEMGNAFQI ANS INGLTMKISVPS PKE 501 

T + + QA+ D+A PP +T+ AF N G+ + + 

Sbjct: 463 TGMTS TAGNA VLDMQ ALQAKQAD I AN I P PQ LTKMG GNT AFD YGNG YRGVYVI KKQLKAEY 522 

Query: 502 ITFLQKYYMLFGFEVNDYNSFIEPINSMTVCNYLKCTGTYTIRDIDPMLMEQLKAILESG 561 

L ++ +G+++N + + NY++ + DI+ +++++ I ++G 

Sbjct: 523 RRSLSSFFHKYGYKINRVKK- - PKLRTRKAYNYIQTKDCFISGDINNNDLQEIRTIFDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

+ WH D GN ++N L 

Sbjct: 581 ITLWHTDDIGNYSVENEL 598 





>gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B103] 
Length = 598 

Score = 77.6 bits (188), Expect = 2e-13 

Identities = 130/623 (20%), Positives = 240/623 (37%), Gaps = 86/623 (13%) 

Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFI RDRMEIN 60 

T+ + F N PF+ DY++T F + + YF + K + NF+ I 

Sbjct: 9 TDVRIFSNVPFSNDYKSTRWFTNADAQYSYF NAKPRVHVTNECNFVGLKEGTPHIR 64 

Query: 61 VDMQWHDAQG INYMTFLS - DFEDRRYYAFVNQI E YVND VWKI YFVI DT I MTYTQGNVLE 119 

V+ + 0 YM F + + ++ +Y FV ++EYVN V +YF ID I T+ + 

Sbjct: 65 VNKRIDDLYNACYMI FRNTQ YSNKWFYC FVTRLEYVNSGVTNLYFE ID VIQTW- MFDFKF 123 

Query: 120 QL5NVNI ERQHLSKRTYNYMLPMIJINNDDVLKVSWKNYV^ 179 

QS+EQ+ P+ D+Ii + V Q ++F S 

Sbjct: 124 QPSYIVREHQEMWDANNE- - - PLTNTIDEGLNYGTEYDWAVEQYKPYGDLMFMVCISKS 180 



Query: 180 KKFGTKKE PNLDTSKGTIYDNITS - - - PVNLYVME YGD F I NFMDKMS A YP W I TQN FQ KVQ 236 

K T E G I NI P++ YV + + D S P +T +VQ 

Sbjct: 181 KMHATAGET FKAGEIAANINGAPQPLSYYVHPF YEDGSS- -PKVTIGSNEVQ 230 

Query: 237 ML-PKDFINTKDLEDVKTSEKITGLKT LKQGGKSKEWSLKDLSLSFSNL 284 

■«■ P DF+ ++ + ++ T + +K SL+D + + 

Sbjct: 231 VSKPTDFLKNMFTQEHAVNNIVSLYVTDYIGLNIHYDESAKTMSLRDTMFEHAQIADDKH 290 

Query: 285 QEMMLSKKDEFKHMIRNEYMTIEFY DWNGNTMLLDAGK 322 

+E + +F NE + Y D+ GN + + 

Sbjct: 291 PNWTIYIJCEVKEYEECTIDTGYKFASFANNEQS 350 

Query: 323 I S QKTGVKLRT KS 1 1 GYHNE VRVY P VDYN S AENDRPILAKNKEILIDTGSFLNTNIT 379 

++ + +K++ + +G N+V DYN+ D+ + A NT++ 
Sbjct: 351 VNG - SNLKIQVRGSLGVSNKVTYSVQDYNADTTLSGDQNLTAS CNTSLI 398 

Query: 380 FNSFAQVPILINNGI LGQSQQANRQ- -KNAESQLITNRIDNVLN GSDPKSRFYDAVS 434 

N+ V 1+ N L Q N+ +N + ++ N + ++L G+ + AV 
Sbjct: 399 NNNPNDVAI I - - NDYI^AYIX3GNKNSLENQKDSILFNGVMSKLGNGIGAVGSAATGSAVG 456 

Query: 435 VASITLSPTALFGKFNEEYNFYKQQQAEYKDIJUiQPPSVTESEMGNAFQIANSINGLTMKI 494 

VAS S T + + QA+ D+A PP + + A+ N G+ + 

Sbjct: 457 VAS - - S ATGMVS SAGNAVLQI QGMQAKQAD I ANT P PQLVKMGGNTAYDYGNG YRGVYVI K 514 

Query: 495 SVPSPKEITFLQKYYMLFGFEVNDYNSFIEPINSMTVCNYLKCTGTYTIRDIDPMLMEQL 554 

+ L + +G++ N + + + NY++ I +++ ++++ 

Sbjct: 515 KQI KEEYRNILSDFSRKYGYKTNLVK - - MPNLRTRES YNYVQTKDCNI IGNLNNEDLQKI 572 

Query: 555 KAILESGVRFWHNDGSGNPMLQN 577 

+ I +SG+ WH D G+ L N 
Sbjct: 573 RTI FDSGITLWHAD PVGDYTLNN 595 



>gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] 
>gi|224163|prf||1011232C protein p9, tail [Bacteriophage 
phi-29] 
Length = 335 

Score = 71.0 bits (171), Expect = 2e-11 

Identities = 64/293 (21%), Positives = 123/293 (41%), Gaps = 20/293 (6%) 

Query: 292 KDEFKHMIRNEYMTIEFYDWNGNTMLLDAGKISQKTGVKLRTKSIIGYHNEVRVYPVDYN 351 

KD+ ++ Y E D+ GN M L 1+ +K++ + +G N+V DYN 

Sbjct: 57 KDQESKLMMVPYCVTEITDFKGNHMKLKTEYINNSK-LKIQVRGSLGVSNKVAYSVQDYN 115 

Query: 352 

+ D + N+ S +N N I I N L Q N+ +N +S 

Sbjct: 116 A- - -DSALSGGNRLTASLDSSLINNNPN DIAILNDYLSAYLQGNKNSLENQKS 165 

Query: 410 

++ N I ++ G + + A+ +AS++ T + + QA+ D+A 

Sbjct: 166 

Query: 467 

PP +T+ AF N G+ + + L ++ +G+++N 

Sbjct: 224 

Query: 527 

DI+ +++++ I ++G+ WH D GN ++N L 

Sbjct: 282 



>gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage 
CP-1] 
Length = 230 

Score = 53.9 bits (127), Expect = 3e-06 

Identities = 29/113 (25%), Positives = 54/113 (47%), Gaps = 3/113 (2%) 



Query: 1 MRKLTNFKFFYNTPF-TDYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRMEI 59 

M++ T + +PF DY N I+F + + +D+P + Y + + + I 

Sbjct: 1 MQESTKIWLYAKSPFKNDYANVINFETRESMEDFFTKKNPHIEIVYEYDKFQYTQRNGSI 60 

Query: 60 NVTJMQWHDAG^INYMTFLSDFEDRRYYAFVNQIEYVNDVVra 112 

k V + + + YM F+++ R YYAFV + Y+N+ +1 + +D TY 
Sbjct: 61 WSGRVEKYENVTYMRFINN- -GRTYYAFVFDVLYINEDATRI I YEVDVWNTY 111 



>gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage 
CP-1] 
Length = 586 

Score = 42.2 bits (97), Expect = 0.010 

Identities = 79/381 (20%), Positives = 139/381 (35%), Gaps = 92/381 (24%) 

Query: 277 LS LS FSNLQEMMLS K--KDEFK HMI RNEYMT I EF YDWNGNTMLLD AG KISQKT 327 

L +++ +QE + S KD+ + ++ +E+ IE YD GN+ + I + 

Sbjct: 187 LKIAYDQIQEGLRSYMGKDDLEIEVQLLNSEFTEIELYDIYGNSYVYQPQYLPRTIDEAH 246 

Query: 328 G VKLRTKS I IGYHNEVRVYPVDYNSAEN DRPIL - 360 

K+ +G N+V + ++YN+A N D+ IL 

Sbjct: 247 KYKVIVSGSIiGDSlJQVHINFLEYNNANWSYADKNILDSLESGDWAEHNPEHFKYGLNDV 306 

Query: 361 -AKNKEILIDT-GSFLNTNITFNSFAQVPILINNGILGQSQQANRQKNAESQLITNRIDN 418 

K+ IL D S++ ++ Q+ N +L QS + ++ A + + 

Sbjct: 307 TGKSVAILNDAEASYIQSHKNQMEHTQLTFKENRDMLKQSVDLSNKQVATANSQASYNAQ 366 

Query: 419 VLNGSDPKSRFYDAVSVASNLSPTALFGKF- - -NEEYNFYKQQQ- - 4 59 

S +++ + S N++ L G F N +YN QQ 

Sbjct: 367 FAVDSANINQWTEGASGILNVAGNLLTGNFGGALGGLASGGMKVFNANRDYNDKWQXyjF 426 

Query: 460 AE YKDLALQ PP S VTE S EMGNAFQI ANS I N 4 88 

A DL QP SV + AFQ N + 

Sbjct: 427 TSENNALKSQSNAIJ^KSKIAI^SIRAYNATMADLQNQPISVQQIGNDLAFQSGNRLT 486 

Query: 4 89 GLTMKISVPSPKEITFIjQKYYMLFGFEVNDY-NSFIEPINSMTVCNYLKCTGTY--TIRO 545 

+ K+S+ ++ +Y +GVN + N + +S NY+K T+R 
Sbjct: 487 DVYWKVS LAQKE IMG RANEY I KCYG VLVNWFTNDALS VMRS RKRFNY I KM INVNLGTLR - 545 

Query: 546 I D PMLMEQLKAI LESGVRFWH 566 

+ M ++AI +SGVR W+ 
Sbjct: 546 ANQSHMNAIQAI FQSGVRIWN 566 
Query= pt|110875 44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1 1 
(415 letters) 

>gi|3845203 (AE001399) GAP domain protein (cyclic nt signal 
transduct.) [Plasmodium falciparum] 
Length = 1245 

Score = 52.3 bits (123), Expect = 6e-06 

Identities = 59/246 (23%), Positives = 105/246 (41%), Gaps = 27/246 (10%) 



Query: 174 

+S D N+ N + + N+V FS+ N IY++L N +YK + E+ 

Sbjct: 854 DSSDNNNNNNNNNNNNNNYNNNNSVI FST NEKIYDML NRDNIYKKVKKEIF 904 

Query: 234 RNDYVNEKRNTRAFNSNDDAMTTGEFEFNEYNLADDN^ - 291 

D + + + +N + M +NN ++N+ N+ N NGD Y KY 

Sbjct: 905 

Query: 292 

++N ++ + ++ KE K+ I + L +F+K NM 

Sbjct: 965 

Query: 351 

NAY + N KIF+EK+MF+ +KIY+ N + N K 

Sbjct: 1019 SNAYGEKCFFFN FPQIKEIIFWEYEKKMDMKYFKMLKKIYKYNLNKIFSNNYK 1073 

Query: 407 

+++K+ 

Sbjct: 1074 



>gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; 
MAL3P6.23 (PFC0820w), Hypothetical protein, len: 4982 aa 
[Plasmodium falciparum] 
Length = 4981 

Score = 49.2 bits (115), Expect = 5e-05 

Identities = 67/287 (23%), Positives = 110/287 (37%), Gaps = 60/287 (20%) 

Query: 127 ITDLNSATDLKYHSNFLKHYPIIIYDEFLALEDDYLIDEWDKLKT IYES1DRNHGN 182 

I D+N + D+ + +++ I YD +++DK++ IY +ID++ N 

Sbjct: 3619 IMDINKSKDISXNMEIVQN---IEYD-- NKYDKIRNDMDAIYMAIDKDMDN 3664 

Query: 183 VDYXGFPKMFLLGHAVNFSSPI LSNLNI YML LQKHKMNTSRLYKNI FLEMRRNDYV 238 

+ 1 +FL NS +N YNL ++ K N R Y N F +D 
Sbjct: 366S I G 1 1 NCMRYFNL YKNYNN LS NECNNRE - YNLNELYHEDI KRNMKR - YDNNFNINHYDDNN 3722 

Query: 239 NEKRNTRAFNSNDDAMTTGEFEFTCEYNLADDNLRN^ 298 

N N N+N++ N N ++N N+ N KG F+ 0 
Sbjct: 3723 NNNNNNNNNNNNNNNNNNNNNNNNN^ 3771 

Query: 299 TFMTNIIWPYTKQYEFCTKIRDIDNHVTYIJa3DMFTKENMERYYYNPSKIiH 358 

K FCTK ++F +N+E N N N Y+ N 

Sbjct: 3772 KDLFFCTK KNI FPCKN I ETVCKNEYNKKI YKNYTCK 3807 

Query: 359 YVVDNDRYLYLDMNKIIKFHIKNEMKKNMSEFERKEK-IYEDNYTEN 404 

V+N + ++IK + + N E+ + EK +Y + EN 

Sbjct: 3808 ISVNNTLNCI^IIKELIKLNNNKKKIIiNYYEYHKVEKLLYYRHS 3854 

Score = 35.6 bits (80), Expect = 0.70 

Identities = 62/290 (21%), Positives = 121/290 (41%), Gaps = 65/290 (22%) 

Query: 2 VKQNRLDMVRD YQNAVN - - HVRKKI PDKYNQI ELVDELMNDD IDYYI S ISNRSDGKS FNY 59 

+K+N ++ +N +N +V++ DK N I D++I+ SN + +SF 

Sbjct: 4445 IKRNNINKSNIKRNNINKSNVKRSNTDKSNVIS DFHIT-SNNNITRSFT- 44 92 

Query: 60 VSFFIYLAIKLDIKFTLIiSRHYTLRDAYRDFIEEIIDENPLFKSKRVTFRSARDYLAIIY 119 

A D F LS TL +Y +F + + I 
Sbjct: 4493 ATLTDSIFNTLSE- -TLNYSYDNFFSNMDN IKI 4523 

Query: 120 QDKEIGVTTDLNSATDLKYHSNFLKHYPIIIYDEFL ALEDDYL I DEWDKLKT I YE 174 

+ EI 1TD++ +YH N+LK + +E++ + +D + DE ++T+ E 

Sbjct: 4524 KKNEINNITDVDYGNKKEYHENYLKVXQNKVNEEYIEETFKSDKDCSIKDEACTIRTLSE 4583 
Query: 175 S--IDRNHGNVDYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMN- -TSRLYKNIFL 230 

SIN N+D + + + S P N++ N ++K+ +N R+ KN 

Sbjct: 4584 SCNISENISNID MDDEDHISFPNGRNVHDNNYMKKNHVNYDKMRVGKNKIP 4634 

Query: 231 EMRRNDYVNEKRNTRAF1ISNDDAMTTGEFEFTIEYNLADDNIJINHINQNCT 280 

D + +++ + +D M++ ++ E ++ + L + NG+ 
Sbjct: 4635 SFTHFDKILDEKKKK SDKDMSSSKWLEREEHIKEIKLEKNEYMNGN 4680 

Score = 34.0 bits (76), Expect = 2.0 

Identities = 47/211 (22%), Positives = 84/211 (39%), Gaps = 32/211 (15%) 

Query: 210 IYNLU3KHKMNTSRLYKNIFLEMRRNDYVNE 269 

I++LLQK LY+N+ + R + N+ T E ++ + ++ 

Sbjct: 918 IFSLLQKDSSPLLVLYENVHI - REGEKYGRNE - - ATDNEVDYKKGDI I KH 964 

Query: 270 NLRNHINQNGDFFYIKTD DKYIKTVT^YNVTTFMTNIIVVPYTKQYEFCTKIRDIDNHV 326 

N+ N + D + D+ K MY + V E K D+ N+ 

SbjCt: 965 NVTNEHGNHSDSYPYGNSLNLDRKPKNMYE- DIYKEKGFVKSDCSNIEI - - KKNDMINND 1021 



Query: 327 TYIJU)DMFYKENMERYYYNPSNLHFDNAYSKNYVVDNDRYLYLDMNKI I KFHIKNE 382 

y +++ py+++ Y+ + YV++ +YL +N ++ F +KN+ 

Sbjct: 1022 VYKKNE - FYEDSRINMI YDEDEIKTWFLI PHKYVIN 1 IYLFLNILLTDESNFKLKNK 1077 

Query: 383 MKKNMSEFERXEKIYEDN YIENTKKY 4 08 

E K IYEDN ++N KKY 

SbjCt: 1078 KYGYFVNEETKGTIYEDNNGLQEILKNGKKY 1108 

Score = 33.6 bits (75), Expect = 2.7 

Identities = 42/198 (21%), Positives = 77/198 (38%), Gaps = 42/198 (21%) 

Query: 222 SRLYKN I FLEMR RNDYVNEKRNTRAF NSNDDAMTTGE FEFNEYNIA 267 

S LY I++ + +N K+NT + N+++D TT E + + 

Sbjct: 411 SVLYS 1 1 YMNKKYKKKNFI ITNKKNTNVYFENDVIQLS VENTSEDTFTTNTRES SLNS GM 470 

Query: 268 DDNLRNHINQNGDFFYIKTDDKYIKVMYlTVTrF>rrNIIWPYTKQYEFCTK^ 327 

+++R +N D +DDK ++Y N YTK E 
Sbjct: 471 MNDMRYSVNNYADEKVYHSDDKSDHLIYKHVHDEKNKYDEMYTKTKE - 517 

Query: 328 YLRDDMFYKENMERYYYNPSNIJJFDNAYSKNYVVDND^ 387 
+++ YK N+ + N K LD+ K I H+KN+ + N 



Sbjct: 518 

Query: 388 

+ ++K + + + YI+N 

Sbjct: 564 

>gi|3845297 
Length = 2380 

Score = 46.0 bits (112), Expect = 1e-04 

Identities 

Query: 20 VRKKIPDKYNQIELVDELMNDDIDYYISISNRSDGKSFNYVSFF I YLAIKLDIKF 74 

+++K +K ++ + +N D + ++ R K+ NY++ +YL I DI 

Sbjct: 1049 

Query: 75 TLLSRHYTLRDAYR- DFIEEIIDEN-PLFKSKRVTFRSARDYLAIIYQDKEIGVI 127 

+Y +++ Y + + + EN + +++ ++ + Y +K+ 

Sbjct: 1109 

Query: 128 

D+N D+ ++ +K+ II EFL L+ D I + KLKT ++ 

Sbjct: 1169 EDMNEL-DI LVNTYDMKYDKI I EFLKNNGYLKIDRYIYFYPKLKT DI 1214 

Query: 184 DYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMNTSRLY KNIF- -LEMRRN 235 

F ++FIi N + L NI +++ K + Y K IF + M+ + 

Sbjct: 1215 

Query: 236 

D+V K N+ FN+ D + N YN D+ N+ N N +Y K 

Sbjct: 1274 DHVMNKNYYNNQYVNNSNMFNTRGDHNN^ 1332 

Query: 288 DKYIKVMYNVTTPMTNI IV- - - VPYTKQYEFCTKIRDIDNHVTYLRDDMFYKEN ME 340 

+K K+MY +•++ + V K + K I + Y+++ N + 

Sbjct: 1333 NKN-KIMYEKERKSSSLFISWNVQDVKPIKHYLKYSSIYKNPIYIISEIKNFNNKITKIN 1391 

Query: 341 RY - YYNPSNLHFDNAYSKNYWDNDRYLYL 369 

RY YYN NL+ D+ ND YL+L 

Sbjct: 1392 RYNYYNYMNLNIDDL NDAYLFL 1413 

Score = 32.5 bits (72), Expect = 6.0 

Identities = 46/183 (25%), Positives = 73/183 (39%), Gaps = 26/183 (14%) 

Query: 225 YKNIFLEMRROTYVNEKRNTRAFNSNDDAMTTGEFE 284 

+KNI ++ ++N + NSN + + N N+ +N N IN + I 

Sbjct: 27 HKNINKNIKNKKFINIDNSNNCNNSNSNNSNSNNNNNNNNNIVRN^ 85 

Query: 285 KTDDKVI KVMYNVTTFMTNI I WPYTKQYEFCTKI RDI DNHVTYLRDDM FYKENMERYYY 344 

+D IK V NI Y ++■ + D+ N+ + + KE ER 
Sbjct: 86 LNEDDDIKNKELVDESFVNIFF- - YENYFKNLFNLNDVSNNKVI - -NIIEQKEGDER 138 

Query: 345 NPSNIJiFDNAYSKNYVVDNDRYLYlJ^HNKIIKFHIKNEMKKNMSEFERKEKIYE^ 404 

N N N +KN V DN +NK IKN +N++E Y N++ + 

SbjCt: 139 NADN NLKNKN I VRDN INK IKN--TRNVNEILIYNNKYIINFLND 180 

Query: 405 TKK 407 
T K 

Sbjct: 181 TTK 183 

>gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon; 
MAL3P5.6 (PFC0600w), Hypothetical protein, len: 250 aa 
[Plasmodium falciparum] 
Length = 249 

Score = 47.3 bits (110), Expect = 2e-04 

Identities = 53/215 (24%), Positives = 87/215 (39%), Gaps = 30/215 (13%) 

Query: 209 NIYNLLQKHKMNTSRLYKNIFLEMRRNDYVNE -NEYNL 266 

NIYN L++ YKN N ++ +N N+N EFE N YN 

Sbjct: 13 NIYNKLEEK YKNFUCLKNMNSHMGASQNMW-NNNYTMNELEEFEKINNNYNN 64 

Query: 267 ADDNLRNHINQNGDFFYIKTD DKYIKVMYNVTTFMTNIIVVPYTKQYEFCTKIRD 321 

++N+ N+IN D+ IK +K ++ YN +1 T +++ 

Sbjct: 65 

Query: 322 

EN + N + N+ S NY DN+ LY +N++ K 

Sbjct: 125 

Query: 377 

KI KKY++K 

Sbjct: 183 SKI DKKYIIK 209 


>gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum] 
Length = 1247 

Score = 45.7 bits (106), Expect = 6e-04 

Identities = 52/239 (21%), Positives = 94/239 (38%), Gaps = 38/239 (15%) 



Query: 206 

+N N +N ++K K R I +N + +N ++N+D EN N 

Sbjct: 474 

Query: 266 

D+N N+ + N D I D+ Y +YN T ++ YTK + + 

Sbjct: 534 

Query: 321 [ DNHVTYLRDDMFYKENME RYYYN PSNLHFDNAYS 356 

+ + ++ + FY++N + ++YYN + N 

Sbjct: 593 - - DMLPS I KFETFYEKNTDHKNFNENYKFYYNTDDDTD I INAIKKKNVKNKKKNGN I VI 649 

Query: 357 

KNY+ N+ Y YL+ N+ + I + K +E K+ 1+ ++Y E 

Sbjct: 650 KNYINHNE-YSYLEYNENKNYEINKKEKLL^ 707 

Score = 41.0 bits (94), Expect = 0.016 

Identities = 58/245 (23%), Positives = 96/245 (38%), Gaps = 43/245 (17%) 

Query: 207 NLNIYNLLQKHKMOTSRLYKHIFl^MRRNDY^ 266 

N+N+YN + KK YF+D+ + + N D E YN 

Sbjct: 564 NINLYNE3TCKKKCMLI3NSYTKYFFYIFTL 623 

Query: 267 ADD NLRNHINQNGDFF- - - YIKTDDKYIKVMYNVT-TFMTNIIWPYTKQ 312 

DD N++N +NG+ YI ++ Y + YN + N T+ 

Sbjct: 624 DDDTDI INAIKKKNVKNK- KKNGNIVI KNYINHNE - YSYLEYNENKNYEINKKEKLLTEN 681 

Query: 313 YEFCTKIRDIDNHVTYLRDDMFYKENMERYYYNPSNLHFDNAYSK NYV--VD 362 

YE+ I+D ++ Y D + + YN +N +N Y K +Y+ VD 

Sbjct: 682 YEYDMY I KDN IH YNDYSEGDGKQTKXAS S FLYNNNN - - - NNKYKKEDNKTQI I SYMDHVD 738 

Query: 363 NDR YLYIxDMNKI I KFH I K- NEM KKNMSEFERKEKIYEDNYIENTKKY 408 

M+ Y + +++ F +K N+M K+ F +E I + +EN K+ 

Sbjct: 739 NENGVKGLKKRNLFYNNSDQLYNFDVXDNDMIKYEKRQSI^FVEEEFINGNRKMENEDKH 798 

Query: 409 LMKQY 413 
L K Y 

Sbjct: 799 LKKHY 803 

Query= pt|110877 44AHJDORF007 Phage 44AHJD ORF|2044-3027|1 1 
(327 letters) 

>gi|1181960|emb|CAA87731.1| (Z47794) connector protein 
[Bacteriophage CP-1] 
Length = 337 

Score = 45.7 bits (106), Expect = 5e-04 

Identities = 44/184 (23%), Positives = 84/184 (44%), Gaps = 13/184 (7%) 

Query: 127 QI HKLYDNCMSGNFWMQNKP I QYNSD I E 1 1 EHYTD ELAE VALSRFSL I KQAKFS K- - IF 184 

++HK + + +V+ N Y I +E + ++LA++ L+ L A+ + IF 
Sbjct: 125 ELHKDNPDKIKRPCIVT PNNNF-YEPYIGYLELFCEKLADIELT- IQLNRNAQITPYFIF 182 

Query: 185 KS E INDES INQLVS E I YNG AP FVKMSPM FNAD DDIIDLTSNSVIPALTEMKR 236 

N S+ + ++I N P V ++ + D D I + h ++ 

Sbjct: 183 ADNTNVLSMKN I FNKI ANFEP WYLNKQKDQDGQDS FKQLS DYI QVFRTDAPFIiLDKLHD 242 

Query: 237 EYQNKISELSNYLGINSIAVDKESGVSDEEAKSNRGFTTSNSNIYLKGREP-ITFLSKRY 295 

E +++L ++GIN+ DK+ + EASNG ++N + KR + ++K Y 
Sbjct: 243 EKLRVMNQLLTFIGINNNPSDKKERLWSEAISNNGVISANIEVGWKSRRKFVELINKCY 302 

Query: 296 GLDI 299 
GL+I 

Sbjct: 303 GLEI 306 

>gi|1429239|emb|CAA67658| (X99260) upper collar protein 
[Bacteriophage B103] 
Length = 308 

Score = 44.9 bits (104), Expect = 8e-04 

Identities = 40/159 (25%), Positives = 73/159 (45%), Gaps = 11/159 (6%) 

Query: 150 YNSDIEI IEHYTDELAEVA-LSRFSLIMQAKFSKIFKSEINDESINQLVSEIYNG 203 

YN+D++ +E + +LAE+ + + Q I ++ N S+ + ++ 

Sbjct: 121 YNNDLKCSTLPALEMFAQDLAELKEIIAVNQNAQKTPVIilAANDNNQLSLKNIYNQYEGN 180 

Query: 204 APFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESGV 262 

AP + + + D+ + + V+ L K N E+ YLGI + ++K+ + 
Sbjct: 181 APVI FVHESI^IiDNLKVFKTDAPYVVDKLNAQKNAVWN EVMTYLGIKNANLEKKERM 237 

Query: 263 SDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300 

E SN S+ NIYLK R E +S+ YGL++K 

Sbjct: 238 VTSEVDSNDEQIESSGNIYLKARQEACNKISELYGLNLK 27S 

>gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR 
PROTEIN) (LATE PROTEIN GP10) >gi|75851|pir||WMBP10 gene 
10 protein - phage PZA >gi|216059 (M11813) upper collar 
protein [Bacteriophage PZA] 
Length = 309 

Score = 43.8 bits (101), Expect = 0.002 

Identities = 38/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%) 



Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF- -KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE+ S+ A+ + + ++ N S+ Q+ ++ 

Sbjct: 122 YNNDMSFPTTPTLELFAAELAELK-EIISVNQNAQKTPVLIRANDNNQLSLKQVYNQyEG 180 

Query: 203 GAPFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261 

AP + ++D ++ + V+ L K N E+ +LGI + ++K+ 
Sbjct: 181 NAPVI FAHEALOSDS I EVFKTDAP YWDKLNAQKNAVWN EKMTFLG I KNANLE KKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300 

+ +E SN S+ ++LK R E +++ YGLD+K 

SbjCt: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLDVK 277 



>gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR 
PROTEIN) (LATE PROTEIN GP10) >gi|75852|pir||WMBPC9 gene 
10 protein - phage phi-29 >gi|215328 (M14782) upper 
collar protein [Bacteriophage phi-29] >gi|215340 
(M12456) p10 connector protein [Bacteriophage phi-29] 
>gi|224161|prf||1011232A protein p10, connector 
[Bacteriophage phi-29] >gi|225365|prf||1301270E gene 10 
[Bacteriophage phi-29] 
Length = 309 

Score = 41.4 bits (95), Expect = 0.009 

Identities = 37/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%) 

Query: 150 YNSDIEI 1 EHYTDELAEVALSRFSLIMQAKFSKIF-- KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE+ S+ A+ + + ++ N S+ Q+ ++ 

Sbjct: 122 YNNDMAFPTTPTLELFAAELAELK-EIISVNQNAQKTPVliIRA 180 

Query: 203 GAPFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261 

AP + ++D ++ + V+ L K N E+ +LGI + ++K+ 
Sbjct: 181 NAPVI FAHEAIJDSDS I EVFKTDAP YVVDKLNAQKN A VWN EMMTFLG I KNANLE KKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR- EPITFLSKRYGLDIK 300 

+ +E SN S+ ++LK R E +++ YGL++K 

Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLNVK 277 

Query= pt|110878 44AHJDORF008 Phage 44AHJD ORF|3020-3775|2 1 
(251 letters) 

>gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase 
[Dictyostelium discoideum] 
Length = 718 

Score = 52.3 bits (123), Expect = 3e-06 

Identities = 28/118 (23%), Positives = 56/118 (46%), Gaps = 5/118 (4%) 

Query: 121 YLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYV SLPQSEVNIDVDN 176 

+ + GF N ++ SN + +N N + N+ T N N + ++ + +N + +N 
Sbjct: 382 FTTTTCFNPTNSNSISNNNNNNNNNNNNT^NNNNNTTNNN^ 441 

Query: 177 TTIJIFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLID-NIDKAYD 233 

+NN I+N N ++N +N N N N N+ + T+ + I N++ +Y+ 
Sbjct: 442 NNNNINNNNI INNNNNNNNNNNNNNNNNNNNNNNNNNSS I SGGTEVFS IS PNLNNS YN 499 



Score = 37.5 bits (85), Expect = 0.094 

Identities = 17/111 (15%), Positives = 45/111 (40%) 

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDN 189 

+N + +N + +N N + +N++ ++ + P + + +++ N+ ++ 

Sbjct: 456 NNNNNNNNNNNNNNNNNNNNNNNS S I SGGTEVFS I S PNLNNS YNSNS SGNSNGSNSNNNS 515 

Query: 190 G KTVNKS S NESNQNAKRNQNQ KGNAKGTQFTKQYLI DNID KAYDLRKKI LN 24 0 

N +N +N N N N N ID+++ + + + N 

Sbjct: 516 NNNTNNDNNNNNNNNNNNNNNNNNNNNNNNNNNNC I DS VNNS LNNENDVNN 566 
Score = 32.8 bits (73), Expect = 2.4 

Identities = 31/140 (22%), Positives = 57/140 (40%), Gaps = 14/140 (10%) 

Query: 109 LNWYSSSEVEKYLQSQGFTEHNEDTTS NTDETSNQNATSLDNSTGMTANRNAYVSL 165 

LN Y+S+ S N+T+ N++NN + +N+ N N + 

Sbjct: 494 LOTISYNSNSSGNSNGSNSNNNSNNNTmroNNNN^ 553 

Query: 166 PQSEVN- -IDVDNTTLRFADNNTIDNGKTVNKSS- - NESNQNAKRNQNQKGNAK 215 

+ +N DV+N+ + +NN D+G N ++ N N + N GN 

Sbjct: 554 WNSLNNENDWNSNINNNNNNNSDDGSNHN^ 613 

Query: 216 GTQFTKQYLIDNIDKAYDLR 235 

Q L++++D D++ 
Sbjct: 614 NLNNNFQ-LLNSLDLNSDIQ 632 



Score = 31.7 bits (70), Expect = 5.4 

Identities = 25/115 (21%), Positives = 48/115 (41%), Gaps = 10/115 (8%) 

Query: 130 HNEDTTSNTDETSNQNATSLDNST GMTAN-RNAYVSLPQSEVNIDVDNTTLRFADNN 185 

+N + +N + +N N +S+ T ++ N N+Y S S N + N+ +N 
Sbjct: 462 NNNNNNNNNNNNNtJNNNSSISGGTEVFSISPNLNNSYN^ 519 

Query: 186 TIDNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQYL I DNIDKA YDLRKKI LN 240 

DN N ++N +N N N N N + ++++ D+ +N 

Sbjct: 520 NNDN NNNNNNNNNNNNNNNNNNNNNNNNNNCIDSVN^ 570 



Score = 31.7 bits (70), Expect = 5.4 

Identities = 15/104 (14%), Positives = 43/104 (40%) 

Query: 110 NVV^SSSEVEKYLQSQGFTEHNEDTTSNTDETSNQNATSI^NSTGOTANRKAYVSLPQSE 169 

N+ +++ + + +N + +N ♦ +N N + +N+ + + V 

Sbjct: 434 NINNNNNNNNNNINNNNI INNNNNNNNNNNNNNNNNNNNNNNNNNS S ISGGTEVFS I S PN 4 93 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

+N ++ + ++ + +N N +++ +N N N N N 
Sbjct: 494 LNNSYKSNSSGNSNGSNSNNNSNNNTNNDNNNNNNNNNNNNNNN 537 



Score = 30.9 bits (68), Expect = 9.2

Identities = 16/84 (19%), Positives = 34/84 (40%)

Query: 130 HNEDTTSNTDETSNQNATSLDNSTCMTANRNAYVS L PQSE VNID VDNTTLRPADNNT I DN 189 

+N + +N + +N N + +N+ + S+ + N N++ +N+ +N 

Sbjct: 455 NNIOJNNNNNNNNNNNNNNNNNNNNS 514 

Query: 190 GKTVNKS SNESNQNAKRNQNQKGN 213 

+ N +N N N N N 
Sbjct: 515 SNNNTNNDNNNNNNNNNNNNNNNN 538 



>gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SPORE
LYSIS A (TYROSINE-PROTEIN KINASE 1) >gi|974334 (U32174)
non-receptor tyrosine kinase [Dictyostelium discoideum]
Length = 1584 

Score = 46.5 bits (108), Expect = 2e-04

Identities = 29/106 (27%), Positives = 48/106 (44%), Gaps = 4/106 (3%)

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID VDNTTLRFADN - N 185 

+NED +SN + +N N + +N+ N N + + N + ++NTT N N 

Sbjct: 442 NNEDISSNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNSNSSNTffi 501 

Query: 186 TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKA 231 

+N N +SN +N N N N N TK+ I + D++ 

Sbjct: 502 NNNNNNNSNSNSNSNNNNINNNNNNNNNNNNIYLTKKPSIGSTDES 547 



Score = 34.0 bits (76), Expect = 1.1

Identities = 20/117 (17%), Positives = 46/117 (39%)



Query: 87 NRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKY^ 14 6 

N G IT T + + ++++ + +N + +N + +N N 
Sbjct: 415 NNNNNNIIGNGKITTTTTTSTSPSSINNNEDISSNNN^ 474

Query: 147 TSLDNSTGOTANRNAYVSLPQSETWIDVDNTTI^FADNNTIDNGKTVNKSSNESNQN 203 

+ ++++ TNN + + N + +N N+ +N N ++N +N N 

Sbjct: 475 NNNNSNSSinTNNlWINNTTNNNNSNSNNNNNNNN^ 531 

Score = 33.2 bits (74), Expect = 1.8

Identities = 18/88 (20%), Positives = 35/88 (39%)

Query: 130 HNEDTTShTn3ETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDN 189 

+N + ++n + +N N T T + S+ +E +N +NN +N 

Sbjct: 405 NNNNNSNNNNNNNNNNIIGNGKITTTTTTSTSPSSINNNEDISS 464 

Query: 190 GICTVNKSSNESNQNAKRNQNQKGNAKGT 217 

N ++N +N N+ + NT 
Sbjct: 465 NNNNNNNNNNNNNNSNS SNTNNNN I NNT 4 92 



Score = 32.5 bits (72), Expect = 3.1

Identities = 18/94 (19%), Positives = 37/94 (39%)



Query: 120 KYLQSQG FTEHNEDTTSNTDETSNQNATS LDNSTGMTANRNAYVS LPQS EVN IDVDNTTL 179 

K + S N + +N++ +N N ++ + +T S N D+ + 

Sbjct: 392 KNVNSTSILVPNGNNNNNSJINNNNNNNIWIIGNGKITT^ 451 

Query: 180 RFADMNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

+NN +N N ++N +N N + + N 
Sbjct: 452 NNNNNNNNNNNNIJNNNNNNNN^ 485 

Score = 32.5 bits (72), Expect = 3.1

Identities = 24/110 (21%), Positives = 44/110 (39%), Gaps = 10/110 (9%)

Query: 138 TDETSNQNATSLDNSTGMTANRN AYVS LPQS EVNI DVDNTTLRFADNNT IDNGK 191 

T T++ + +S++N+ +++N N + + N + +N +NN N 

Sbjct: 429 TTTTTSTSPSSINNNEDISSNNNNNNNNNNNN^ 488 

Query: 192 TVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRXK 237 

T N +SN +N N N N N+ +N + L KK 

Sbjct: 489 IN*nTNNNNSNSNNNNNNNNSNSNSNSNNNNINNN^ 538 

>gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon;
MAL3P6.11 (PFC0760c), Hypothetical protein, len: 3395 aa
[Plasmodium falciparum]
Length = 3394

Score = 46.5 bits (108), Expect = 2e-04

Identities = 52/202 (25%), Positives = 96/202 (46%), Gaps = 32/202 (15%)

Query: 21 FNEFVNDNKLTFYDDEFQFMQKMLKFD-KDVLAIVNEKVFKGFSLKDELSDL- -LFKKSF 77 

F ++ ++ K T D+ M+K K D DV + NEK++ L ++L+ + + KK 
Sbjct: 665 FEKYCSNIKNTLIRDD- - -MKKFRKPDISDVHILHNEKIYLEKLLNEKLNYIKDXEKKIiD 721 

Query: 78 TIHFLDREINRQTVEAFGMQV ITVCITHEDYUJVVYSSSEVEKYLQSQGFTEHNE 132 

+H + IN+ + + +QV IV + DY + S + + K + +N 

Sbjct: 722 ELHGV- - - INKNKED I YI LQVEKQTLI KVI S S VYD YTKME - S ENH I F KMNTTWNKMLNNV 777 

Query: 133 DTTSNTDETSNQNATSLDNSTGMTANRNAYVS LPQS EVNI DVDNTTLRFADNNTIDNGKT 192 

+SN D +NQN +++N+ + N+N N +++N + N +N 

Sbjct: 778 HMSSHKDY-NNQNNQNIENNQNIENNQN NQNIEN NQNIENNQNN 820 

Query: 193 VNKSSNESNQNAKRKQNQKGNA 214 

N +N++NQN + NQN + NA 
Sbjct: 821 QNNQNNQNNQNNQNNQNNQNNA 842 

Score = 33.6 bits (75), Expect = 1.4

Identities = 46/221 (20%), Positives = 89/221 (39%), Gaps = 37/221 (16%)

Query: 10 DFIKSELIKKGFNEFVNDNKLTFYDDEFQFMQKWLKFDKDVLAIVNEKVFKGFSLKDELS 69 

D+KEK N ++LY++ M+K K + V K SL 

Sbjct: 367 DSLKIEYNKSKTNIQQLNEQLVNYKNFIKEMEKKYK QLWKNNSLFS ITH 416 

Query: 70 DLLFKKSFTI HFLDRE INRQTVEAFGMQVI TVCI TH : EDYLNWYS S SEVEKYLQSQG 126 
D+K+I+R+ + + +++IH +D+L+V+Y + + L +
Sbjct: 417 DFINLKNSNI I IIRRTSDMKQI FKMYNLDIEHFNEQDHLSVIY IYEILYNTN 468 

Query: 127 FTEHNEDTTSNTDETSNQNATSLDNSTGOTANIWAYVSLPQSEVNIDVDNTTLRFADNNT 186 

+K D +N D +N N + +N+ N N N + +N + 
Sbjct: 469 - DNNNNDNDNNMDNKNNNNNNNDNNNNTJNNDNNNN NNNYNNIMM M 512 

Query: 187 I DNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQ YLI DN 227 

I+N + N +++ N+NN+N +++Y I+N 
Sbjct: 513 IENMNSGNHPNSJJNLHNYRHNTNDENNLSSLKTSFRYKINN 553 

Score = 32.8 bits (73), Expect = 2.4

Identities = 28/122 (22%), Positives = 53/122 (42%), Gaps = 2/122 (1%)

Query: 119 EKYLQSQGFTEHNEDTTSOTDETSNQNATSLDNSTGMTANRNAYVSLPQSEWID-VDNT 177 

E Y S + +++ N + +N + + DN+ N N ++ +N D ++N 

Sbjct: 2838 ENYPVSTHYDNNDDINKIMINNDNNNDNIND^ 2897 

Query: 178 TLR FADNNTIDNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQYLI DNIDKAYDLRKK 237 

+N+ +NG SSN ++N NNKN4G + + + + YD K 

Sbjct: 2898 NtJNDNNNDNSlTOGFVCELSSNINDFNNILNW-KDN^ 2956 

Query: 238 IL 239 
1 + 

Sbjct: 2957 IV 2958 
Score = 32.5 bits (72), Expect = 3.1 

Identities = 46/249 (18%), Positives = 101/249 (40%), Gaps = 31/249 (12%)

Query: 9 YDFIKSELIKKGFNEFVTTONKLTFYDDEFQFMQKMLKFDKDVIAIVNEKVFKGFSLKDEL 68 

Y+++K ++ N N NK E Q++ K+ + + + +E K L++ 

Sbjct: 2150 YNYVK - - - VQNATNREDNKNK ERNLS QE I YKYINENI DLTS ELEKKNDMLENYK 2200 

Query: 69 SDL LFKKSFTIHFLDREINRQTVEAFGMQVITVCITHEDYLNWYSSSEVEKYL 122 

++L ++K +IL + M+++ N+ E+ + L 

Sbjct: 2201 NELKEKNEEIYKLNNDIDMLSNNCKKLKESIMMMEKYKIIMN NNIQEKDEIIENL 2255 

Query: 123 QSQGFTEHNEDTTSNTDETSNQNATS LDNSTGMTAN RNAYVSLPQSE VNIDV 174 

+++ + +D +N + ++S M+ + N + +1, +S N+D+ 

Sbjct: 2256 KNK-YNNKLDDLINNYSVVDKSIVSCFEDSNI^ 2314 

Query: 175 DNTTLRF ADNNT I DNGKTVNKS S N E SNQNAKRNQNQ KGNAKGTQ FT KQ YL I DN I D KA YDL 234 

N + ++I+N +N +N +N N N N N K YL++N+ D 

Sbjct: 2315 CNENMDSI- -SSINNVNNINNVNNINNV1WINNVNN 2372 

Query: 235 RKKILNEFD 243 
1+ +F+ 

Sbjct: 2373 DNIIIIKFN 2381 
Score = 32.1 bits (71), Expect = 4.1

Identities = 20/103 (19%), Positives = 48/103 (46%), Gaps = 2/103 (1%)

Query: 115 SSEVEKYLQSQGFTEHNEDTTSNTDETSNQN - -ATSLDNSTGMTANRNAYVSLPQSEVNI 172 

+++ EKY EH + N D +N+N L ++ ++ + N S ++E+ 

Sbjct: 3264 NNDEEKYSCHDDKNEHTNNDLLNIDHDNNKNNI 3323 

Query: 173 D VDNTTLRFADNNTIDNGKTVN KS SKES NQNAKRNQNQKGNAK 215 

+ + D N ++ N ++E+++N + ++N + + K 
Sbjct: 3324 LISIDSSNENDENDENDENDENDENDENDENDENDENDENDEK 3366 

Score = 30.9 bits (68), Expect = 9.2

Identities = 27/118 (22%), Positives = 53/118 (44%), Gaps = 15/118 (12%)

Query: 104 THEDYLNWYSSSEV EKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANR 159 

T+ D LN+ + +++ E Y HN+D ++ +E QN S+D+S N 

Sbjct: 3280 TmDLLNIDHDNNKNNITDELYSTYNVSVSHNKDPSKKENEI- -QNLISIDSSNENDEND 3337 " 

Query: 160 NAYVSLPQSEVNID VDNTTLRFADNNT I DNG KTVNKS SNESNQNAKRNQNQKGNAKGT 217 

+++ N + D D N ++ N +E+++N + ++N N +GT 

Sbjct: 3338 EN D END END END EN DENDENDENDENDEKDENDENDENDENFDNNNEGT 3386 
>gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP)

>gi|626139|pir||S45907 DNA-binding protein REB1 - yeast
(Saccharomyces cerevisiae) >gi|536280|emb|CAA84992|
(Z35918) ORF YBR049C [Saccharomyces cerevisiae]
>gi|559944|emb|CAA86391| (Z46260) REB1 DNA-binding
protein [Saccharomyces cerevisiae]
Length = 810 

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNW*SSSEVEKYLQSQGFTEH^ 142 

D+ N+++VE ++ + V + H+++ +++ K+ + Q E + D N++S 

Sbjct: 7 DKNANQES VE EAVLKYVGVGLDHQNHD PQLHTKDLENKH SKKQNI VESS SDVDVNNKDDS 66 

Query: 143 NQNATS LDNSTGMTANRNA YVS L PQS EVNI DVDNTTLRFADNNT ID NGKTVNKSSNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +E 

Sbjct: 67 NRNEDNNDDSENISA LNANESSSNVDHANSNEQHNAVMDWYLRQTAHNQQDDE 119 

Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKK 237 

++N N GN F++ ++ +D D KK 

Sbjct: 120 DDEN- -NNNTDNGNDSNNHFSQSDIV- - VDDDDDKNKK 153 



>gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]
Length = 809 

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCiraEDYLNVVYSSSEVT*! 142 

D+ N+++VB ++ + V + H+++ +++ K++Q E + D N ++ S 

Sbjct: 7 DKNANQESVEEAVLKYVGVGLDHQNHDPQLHTKDLENKHSKKQNIVESSNDVDVimNDDS 66 

Query: 143 NQNATS LDNSTGMTANRNA YVS L PQS EVN I D VDNTT LR FADNNT I D NGKTVNKSSNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +E 

Sbjct: 67 NRNEDNNDDSENISA IJ4ANESSSNVDHANSNEQHNAVMDVTYLRQTAHNQQDDE 119 

Query: 200 S NQNAKRNQNQKGNAKG TQFTKQYLIDNID KA YD LRKK 237 

++N N GN F++ ++ +D D KK 

Sbjct: 120 DDEN- -NNNTDNGNDSNNHFSQSDIV- -VDDDDDKNKK 153 



>gi|2952545 (AF051898) coronin binding protein [Dictyostelium
discoideum]
Length = 560 

Score = 44.9 bits (104), Expect = 6e-04

Identities = 26/83 (31%), Positives = 39/83 (46%), Gaps = 5/83 (6%)

Query: 131 NEDTTSNTD ETSNQNATS LDNSTGMTANRNAYVSLPQS EvNIDVDNTTLRFADNNT IDNG 190 

N + +N +N N+ S +NS +N N+ + P N D DN T +NNT +N 
Sbjct: 404 NNNNNNNI INNNNSNSNSNNNSNN -NSNNNSNRNS PNHNNNGDNDNNT NNNTNNNN 458 

Query: 191 KTVNKSSNESNQNAKRNQNQKGN 213 

N ++N +N N N N N 
Sbjct: 459 NNNNNNNNNNNNNNNNNNNNNNN 481 



Score = 41.4 bits (95), Expect = 0.006 

Identities = 22/88 (25%), Positives = 43/88 (48%), Gaps = 6/88 (6%)

Query: 130 HNEDTTSNTDETSNQNATS LDN STGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNT 186 

+ ++ +N++ SN N+ + +N + G AN++ + P + +N + DN +NN 
Sbjct: 337 NRNNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NSPNNNLNTNNDNKNNNSNNNNN 393 

Query: 187 I DNGKTVNKS SNESNQNAKRNQNQ KGNA 214 

+N S+N +N N N N N+ 

Sbjct: 394 SNNNSNNGNSNNNNNNNIINNNNSNSNS 421 



Score = 40.6 bits (93), Expect = 0.011 

Identities = 24/101 (23%), Positives = 41/101 (39%), Gaps = 2/101 (1%)

Query: 115 SSEVEKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDV 174 
S+ L + ++N +N ++ N S +N+ N N S + N + 
Sbjct: 370 SNSPNNNLNT^J^^)NKlnfNSNMNNNSNNNS^INGNSNNNKNN^II INNNNSNSNSNNNSNNNS 429

Query: 175 DNTTLRFADN- -NTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

+N + R + N N DN N ++N +N N N N N 
Sbjct: 430 NNNSNRNSPtmNNNGDNDNNTNNNTNNNimNNNNNNNNNNN 470 

Score = 40.2 bits (92), Expect = 0.014

Identities = 21/80 (26%), Positives = 39/80 (48%), Gaps = 9/80 (11%)

Query: 130 HNEDTTSNTD ETSNQNATSLDNSTGMTANRNA WSLPQS EVNI D VDNTTLRFADNNTI DN 189 

+N D +NT+ +N N + +N+ N N N + +N +ADN+ ++ 

Sbjct: 442 NNGDNDNNTNNNTNNNNNNNNNNNNNNNNNNN - -NNNNNNNNNNYADNSNNNS 492 

Query: 190 GKTVNKSSNESNQNAKRNQN 209 

+ N +SN +N N +N+N 
Sbjct: 493 SNSNNNNSNSNNNNDNKNEN 512

Score = 39.5 bits (90), Expect = 0.024

Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 20/111 (18%)

Query: 112 VYS SS EVEKYLQSQ - - GFTEHNEDTTSNTDETSNQNATS LDNSTGMTANRNAYVS LPQSE 169 

VY + K+ ++ G +N ++ +N++ SN N ++N N N 
Sbjct: 296 VYCTHHHTKE^ETHRWGLI^INNNNSNNNSNSNSNNNNNGINNRNNSNNNSN 346 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNA 214 

+ N ++N I NG NKS+ N +N " N N N N+ 

Sbjct: 347 NNSNNNSNNSNNRNITNGSNANKSNS PNNNLNTNNDNKNNNSNNNNNS 394 

Score = 37.5 bits (85), Expect = 0.094

Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 1/96 (1%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSIJDNSTGM-TAin^AYVSLPQSEVNIDVDNTTLRFA 182 

S + +N + SN + ++ N DN+T T N N + +N++N 
Sbjct: 421 SNNNSNNNSNNNSNRNSPNHNNNGDNDNNTNNNTNNN^ 480 

Query: 183 DNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQ 218 

+NN DN + +SN +N N+ N + K Q 
Sbjct: 481 NNNYADNSNNNSSNSNNNNSNSNNNNDNKNENSDNQ 516 

Score = 35.6 bits (80), Expect = 0.36

Identities = 25/99 (25%), Positives = 42/99 (42%), Gaps = 18/99 (18%)

Query: 130 HNEDTTSNTDETSNQNATSLDNST-GOTANRNAYVSLPQSEWIDV13NTTLRFADNNTID 188 

+N + SN + +N N ++ N T G AN++ + P + +N + DN +NN + 

Sbjct: 339 NNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NS PNNNLNTNNDN KNNNSNNNNNSN 395 

Query: 189 NGKTV- NKSSNESNQNAKRNQNQKGN 213 

N N S++ SN N+ N N N 

Sbjct: 396 NNSNNGNSNNNNNNNIINNNNSNSNSNNNSNNNSNNNSN 434 

Score = 35.2 bits (79), Expect = 0.47

Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 124 SQGFTEHNEDTrSbm3ETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFAD 183 

+ G + ++ +N T+N N + N+ N N+ + N + +N + + 

Sbjct: 362 TNGSNANKSNS PNNNLNTNNDNKNNNSNN NNNSNNNSNNGNSNNNNNNNI INNNN 416 

Query: 184 NNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

+N+ N + N S+N SN+N+ + N N T 
Sbjct: 417 SNSNSNNNSNNNSNNNSNRNSPNHNNNGDNDNNT 450 

Score = 35.2 bits (79), Expect = 0.47

Identities = 29/118 (24%), Positives = 53/118 (44%), Gaps = 12/118 (10%)

Query: 115 SSEVEKYLQS-QGFTEHNEDTTSMTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID 173 

SS+ E ++ +GF + + T+N ++N D S+G + + + V+ P+S +N 

Sbjct: 114 SSDSEADIEDDKGFQD- -KPITTNNSGSNNPLKNLKDYSSGSSGSSRSGVNQPRSNINNS 171 

Query: 174 VDNTTLRFADNNT IDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQ 222 

D + + +N+ I + T + NQN +NQNQ N Q +Q 
Sbjct: 172 mjKYKSKSSSSNSNSSSSGGSLISSLLTGGNTYQNQNQNQNQNQNQNNNQSQLQQQQQ 229
Score = 34.4 bits (77), Expect = 0.81

Identities = 24/94 (25%), Positives = 38/94 (39%), Gaps = 12/94 (12%)

Query: 131 NEDTT S NTD ETSNQNAT S LDNSTGKT ANRNA YVS L P Q S E VN I D VDNTTLR F ADNNTI DNG 190 

N +T +N + +N N + +N+ N N S N N +NN+ N 

Sbjct: 4 51 NNNTNNNNNNNNNNNNNNNNNNNNNNNNNNNN^ NNNSNSNN 504 

Query: 191 KTVNKS SNESNQNAKR NQNQKGNAKGTQ 218 

NK+ N NQ+ R ++NQK + Q 

Sbjct: 505 NNDNKNENSDNQSVLRSNEKFTDENQKNGSDDQQ 538 



Score = 33.6 bits (75), Expect = 1.4

Identities = 22/90 (24%), Positives = 35/90 (38%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANI^AyVSLPQSEvNTDVDNTTLRFAD 183 

S N SN +++++ N N+ NK + +N++N 

Sbjct: 353 SNNSNNRNITNGSNANKSNSPNNNLNTNNDNKNNN^ I 412 

Query: 184 NNTI DNGKTVNKS SNESNQNAKRNQNQKGN 213 

NN N + N S+N SN N+ RN N 
Sbjct: 413 NNNNS NSNSNNNSNNNSNNNS NRNS PNHNN 442 



>gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium
reichenowi]
Length = 655 

Score = 44.5 bits (103), Expect = 7e-04

Identities = 31/114 (27%), Positives = 47/114 (41%), Gaps = 14/114 (12%)

Query: 128 TEHNEOTTSNTDBTSMQNATSLDNSTGMTANRKAYVS LPQSEVN IDVDNTTLRF 181 

T++N T TD + + +N+T A N + ++ N D +NT + 

Sbjct: 433 TDNNNTNTKATDSNNTNTKATDNNNTNTKATDNNNT^ 492 

Query: 182 ADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT QFTKQYLIDN 227 

DNN DN T K+++ +NNK N NKT T QY+ N 

Sbjct: 493 TDNNNTNTKATDNNNTNT KATDNNNTNT KATDNNNTIHTCATDNNNNTNQYVFAN 546 



Score = 44.5 bits (103), Expect = 7e-04

Identities = 30/103 (29%), Positives = 44/103 (42%), Gaps = 13/103 (12%)

Query: 128 TEHKEDTTSNTD ETSNQNATS LDNS TGMTANRNAYVS LPQSEVN IDVDNTTL 179 

T++N T TD+++N + + DN+ T T N N S D +NT 

Sbjct: 401 TDNNNTDTKATDKSNNTDTKATDNNNNTDTK^ 460 

Query: 180 RFADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

+ DNN DN T K+++ +N N K N NKT 

Sbjct: 461 KATDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKAT 503 



Score = 42.6 bits (98), Expect = 0.003 

Identities = 27/96 (28%), Positives = 43/96 (44%), Gaps = 10/96 (10%)

Query: 128 TEHNEDTTSNTDETSNQNATS LD - NSTGMTANRNA YVSLPQS EVNIDVDNTTLRFADNNT 186 

T++N +T + + +N N + D N+T AN + ++ N NT + DNN 
Sbjct: 422 TDNNNNTDTKATDNNNTNTKATDSNNTNTKATDNNNTOT NTNTKATDNNN 477 

Query: 187 I DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DN T K+++ +N N K N NKT 
Sbjct: 478 TNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKAT 513 



Score = 41.8 bits (96), Expect = 0.005

Identities = 35/150 (23%), Positives = 59/150 (39%), Gaps = 9/150 (6%)

Query: 85 EINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEraNE 144 

E N+ ++ G T+ + N + E + +Q T +N TT+ + N 

Sbjct: 118 ETNKTN I KLTGNNSTT I NTNLT ENTNA - - TKKLT ENV I TNQ I LTGNNNTTTNTS STEHNN 175 

Query: 145 NATSLDNSTGMTANRNAyVSLPQS EVNIDVDNTTLRFADNNT I DNG KTVNKS SNESNQNA 204 
N + NSTG T+ NI + N L +N T + T + ++ +N N+ 




Sbjct: 176 NINTNTNSTGNTSTTKKLTE NI - ITNQILTGNNNTTTNTSSTEHNNNINTNTNS 228 

Query: 205 KRNQNQKGNAKGTQFTKQYL IDN I DKAYDL 234 

N N N T + DNI+ +L 

Sbjct: 229 TDNSNTNTNLTOITTTTKKWTDNINTTQNL 258 

Score = 41.8 bits (96), Expect = 0.005 

Identities = 30/101 (29%), Positives = 43/101 (41%), Gaps = 13/101 (12%) 

Query: 130 HNEDTTSNTDETSNQNATSLDNS-TGMTANRNAYVSLPQSEVNIDV DNTTLRFA 182 

+N DT S ++ ++ AT DN+ T T N N + N D +NT + 

Sbjct: 363 NNTDT I STDNDNTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTK^ 422 

Query: 183 DNN TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN DN T K+++ +N N K N N K T 

Sbjct: 423 DNNNNTDTKATDNNNTNTKATDSNNTbH^ 463 

Score = 40.6 bits (93), Expect = 0.011

Identities = 31/121 (25%), Positives = 47/121 (38%), Gaps = 31/121 (25%)

Query: 128 TEHNEDTTSNTDETSNQNAT SLDNSTGMTANRNAYVSLPQSEVN 171 

TEHN + +NT+ TN+T ++++TNN + +EN 

Sbjct: 171 TEHNNNINTNTNSTGNTSTTKKLTENI ITNQILTGNNNT^ 230 

Query: 172 1 DVDHTTLRPADN NT I DNG KTVNKS S N E S NQN AKRNQNQKGNAKG 216 

D+ TT ++ DN T N TV+ +N +N N K N N K 

Sbjct: 231 NSNTtmiLTOITTTTKKWTONINTTQNLT^ 290 

Query: 217 T 217 
T 

Sbjct: 291 T 291 
Score = 38.3 bits (87), Expect = 0.055 

Identities = 28/98 (28%), Positives = 41/98 (41%), Gaps = 10/98 (10%)

Query: 128 TEHNEDTTSNTDETSHQNATSLDNSTGMTANRNAYVSLPQSEVNIDVD - NTTLRFADNNT 186 

TEHN + +NT+ S N+ + N T +T 4 + N+ NTT DNN 

Sbjct: 216 TEHNNNINTNTN- - STDNSNTNTNXiTDITTTTKKWTDNINTTQNLTTSTNTTTVSTDNNN 273 



Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DN T KS++ N K N+ + K T 
Sbjct: 274 NNINTKPTDNNNTNIKSTDNYNTGTKETDNKNTDIKAT 311 

Score = 37.5 bits (85), Expect = 0.094 

Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 18/106 (16%)

Query: 128 TEHNEDTTSNTDBTSNQN ATSLDNSTGMTANRNAYVSLPQSEVN IDVDN 176 

T++N +T +T T N N AT N+T AN + ++ N D +N 

Sbjct: 390 TDNNNNT--DTKATDNNNTDTKATDKSNNTDTKATDNNNNTTJTKATO 447 

Query: 177 TTLRFADNN T I DNG KTVN KS S NE S NQN AKRNQNQKGNAKG T 217 

T + DNN DN T K+++ +N N K N N K T 

Sbjct: 448 TNTKATDNNNTNTKATDNNNTNTKATDNNNTNT KATDNNNTNTKAT 493 

Score = 35.2 bits (79), Expect = 0.47

Identities = 24/109 (22%), Positives = 46/109 (42%), Gaps = 6/109 (5%) 

Query: 128 TEHNEDTTSNTDETSNQNATS LDNS TGMTANRNAYVS LPQS EVN IDVDNTTLRF 181 

T++N T TD + + +N+T AN + ++ N D +NT + 

Sbjct: 473 TDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKATO 532 

Query: 182 ADNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDK 230 

DNN N + +E+ + K N++ N++ + K + +DK ' 

Sbjct: 533 TDNNNNTNQYVFANNYDETTSDDKLNKDSCDNSEEKENIKSMINAYLDK 581 



Score = 34.4 bits (77), Expect = 0.81

Identities = 26/126 (20%), Positives = 46/126 (35%), Gaps = 7/126 (5%)

Query: 99 ITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTAN 158

IT T+ + ++ S + V S T +++ +N T N N ++ T

Sbjct: 318 ITTDNTNTNVISTDNSKTNVISKDNSNTHTISTDNSKTNVISTDNNNTDTISTDNDNTDT 377

Query: 159 RNA YVS LPQS EVK I DVDNTTLR F ADNNT I D NG KTVNKS SNE S NQNAKRNQNQ K 211

+ ++ + +NT + DNN D N +K+N + KN

Sbjct: 378 KATONDNTDTKATDNNNNTDTKATDNNNTDTKATDKSNN^ 437

Query: 212 GNAKGT 217

N K T

Sbjct: 438 TNTKAT 443





Score = 34.4 bits (77), Expect = 0.81

Identities = 30/100 (30%), Positives = 44/100 (44%), Gaps = 14/100 (14%)

Query: 131 NEDTTSNTDETSNQNATSLDNS-TGOTANRNAY- --VSLPQSEVNI DVDNTTLRFAD 183 

N + T TDTNN S DNS T + + N+ +S S+ N+ D +NT D 
Sbjct: 313 NNNITITTDNT-NTNVISTDNSKTNVTSKD^ 371 

Query: 184 NNTIDNG KTVNKS S NESNQNAKRNQNQKGNAKGT 217 

N+D TN++ N+N + K N +KT 

Sbjct: 372 NDNTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKAT 411 



Score = 34.4 bits (77), Expect = 0.81

Identities = 28/101 (27%), Positives = 41/101 (39%), Gaps = 15/101 (14%) 

Query: 131 NEDTTSNTDETSNQNATSLDNSTGMTA--NRNAYVSLPQSEVNIDV DNTTLRFA 182 

N DT + ++ ++ AT +N+T ANN N D +NT + 

Sbjct: 374 NTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDT^ 433 

Query: 183 DNNTIDNGK TVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN N K T K+++ +N N K N N K T 

Sbjct: 434 DNNN-TNTKATDSNNTNTKATDNNNTNTKATDNNNTNTKAT 473 



Score = 32.5 bits (72), Expect = 3.1

Identities = 30/110 (27%), Positives = 40/110 (36%), Gaps = 23/110 (20%)

Query: 131 NEDTTSNTDETSNQNATS LDNS TGMTANRNAYVSLPQS EVN I D VDNTTLRF 181 

N +TT N ++N S DN+ TTNN+ + DNT++ 

Sbjct: 251 NINTTQNLTTSTNTTTVSTDNNNNNINTKPTDNNNTNIKSTDNYNTO 310 

Query: 182 ADNNT I DNG KTVNKS SNESNQNAKRNQNQKGNAKGT 217 

DNN I DNKTS+SN+ NKNT 

Sbjct: 311 TDNNNITITTDNTNTNVISTDNSKTNVISKDNSNTHTISTDNSKTNVIST 360 



>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293 

Score = 43.8 bits (101), Expect = 0.001

Identities = 53/204 (25%), Positives = 79/204 (37%), Gaps = 42/204 (20%)



Query: 56 EKVFKG FSLKDELSDLLFKXSFTIHFLD RE INRQTVEAFGMQV I TVC ITHED 107

EK+ KG F + + D ++K F HF+ REI +T F + T I +

Sbjct: 26 EKIEKGRPKLFDFQYPIFDESYRKVFETHFIRNFYMREIGFETEGLFKFNLETWLIINMP 85

Query: 108 YLNWYSSSEVEKY LQSQGFTEH NEDTT SNTDETSNQNA 146

Y N ++ S E+ KY L + G ++ N DTT SNT + NA

Sbjct: 86 YFNKLFES-ELIKYDPLENTRLNTTGNKKNDTERNDNRDTTC 144

Query: 147 TSLDNSTGMTA NRNAYVSLPQSEVNIDVDN- -TTLRFADNNTIDNG KTVNKS 196

T G T NR P S +N+ ++ TL +A + 1+ T NK

Sbjct: 145 TGSSKEDGKTTGSVTDDNFNRKIDSDQPDSRIjNLTTNDGQGTXEYA- -SAIEE 202

Query: 197 SNESNQNAKRNQNQKGNAKGTQFT 220

+ N + + GT T

Sbjct: 203 NTTGTNNVTS S AES ESTGSGTSDT 226





Query= pt|110879 44AHJDORF009 Phage 44AHJD ORF|5744-6496|2 1
(250 letters) 
>gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine
amidase [Staphylococcus phage Twort]
Length = 467 

Score = 180 bits (452), Expect = 1e-44

Identities = 89/157 (56%), Positives = 109/157 (68%), Gaps = 8/157 (5%)

Query: 1 MKSQQQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFK 60

MK+ +QA+ +1 G DFDG YG+QCMDL+V Y+Y++TDGK+RMWGNAKDAINN F

Sbjct: 1 MKTLKQAESYIK^KVNTGTDFDGLYGYQCKDIAVDYIYHVTIXSKIRMWGNAKDAINNSFG 60

Query: 61

G ATVYKN P+F+P+ GDV V+T G YGHI V + G+L Y T LEQNW G G

Sbjct: 61

Query: 114

E ATIRTH Y G+THFIRP F+ +S K +T K

Sbjct: 121



Score = 61.7 bits (147), Expect = 6e-09

Identities = 41/125 (32%), Positives = 57/125 (44%), Gaps = 8/125 (6%)

Query: 125 YYDGVTHFIRPKFSGSNSKAI^TSKVNTFGKWKRNQYGTYYRNENGTFTC-GFLPIFARV 183 

YY+G T P +K + +T G W N YGTYY++E+ TF C I R 

Sbjct: 346 YYEGKTPV- - PTWNQKAICrKP VKQS STSG - WNVNNYGTYYKSES ATFKCTARQGIVTRY 402 

Query: 184 GSPKLSEPNGYWFQPNGYTPYNEVCLSIXSYWIGYhWOGTR-YYLPVRQWNGKTGNSYSV 242 

P + P Y+ VC DGYVWI + G + ++PVR W+ N+ + 
Sbjct: 403 TG PFTTCPQAGVLYYGQS VTYDTVCKQDG YVW I SWTTNGGQDVWMPVRTWD KNTD I M 459 

Query: 243 GIPWG 247 

G WG 
Sbjct: 460 GQLWG 464 



>gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN

(N-ACETYLMURAMOYL-L-ALANINE AMIDASE)

>gi|79887|pir||JQ1147 N-acetylmuramoyl-L-alanine amidase
(EC 3.5.1.28) - Staphylococcus aureus >gi|153067
(M76714) peptidoglycan hydrolase [Staphylococcus aureus]
Length = 481 

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKALETSKVNTFGK-WKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEPNQ 193 

P + SN + ++ V WKRN+YGTYY E+ FT G PI R P LS P G 

Sbjct: 365 PVATVSNES SAS SNTVK PVAS AWKRNKYGTYYMEESARFTNGNQPI TVRKVG PFLS C PVG 424 

Query: 194 YWFQPNGYTPYNEVCIiSIXJYWIGYNWQ^TRYYLPVRQWNGKTGNSYSVGIPWGVFS 250 

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 42S YQFQPGGYCDYTEVill^IXSHvWGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 4 81 



Score = 78.0 bits (189), Expect = 7e-14

Identities = 48/109 (44%), Positives = 62/109 (56%), Gaps = 6/109 (5%)

Query: 15 EGAGVT3FDGAYGFQCMDLSVAYVYYITDGKVRMWGKAKDA- INNDFKGLATVYKNTPSFK 73 

EG + D YGFQC D + A + + G + AKD N+F GLATVY+NTP F 

Sbjct: 18 EG KQ FNVD LWYG FQC FDY AN AG - WKVL FG LLL KG LG AJCD I P F ANN FDG LATVYQNT PD FL 76 

Query: 74 PQ LGD VAVYTNG Q YGHI QCVLSGNLDYYTC LEQNW LGGGF - DGWEK 118 

Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG E+ 

Sbjct: 77 AQPGDMVVTGSNYGAGYGHVAWVIEATLDYIIVYEQNWIjGGGWTDGIEQ 125 



>gi|1763243 (U72397) amidase [bacteriophage 80 alpha]
Length = 481 

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%) 



Query: 135 PKFSGSNSKALETSKVNTFGK-VnCRNQYGTYYRNEKGTFTCGFLPIFARVGSPKLSEPNG 193 

P t- SN + ++ V WKRN+YGTYY E+ FT G PI R P LS P G 

Sbjct: 36S PVATVSNESSASSNTVKPVASAWIOINKYGTYYMEESARFTNGNQPITTOKVGPFLSC 424 
Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQFQPGGYCDYTEVMLQDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 4 81 



Score = 83.5 bits (203), Expect = 2e-15

Identities = 50/115 (43%), Positives = 65/115 (56%), Gaps = 6/115 (5%)

Query: 9 EW I YKHEGAGVD FDGAYG FQCMDLS VAYVYY I TDGKVRMWGNAKDA - INNDFKG LATVYK 67 

EW+ EG + D YGFQC D + A + + G + AKD N+F GLATVY+ 

Sbjct: 12 EWLKT S EG KQ FNVD LWYG FQC FD YANAG - W KVLFG LL L KG LGA KD I P FANN FDGLATVYQ 70 

Query: 68 NTPSFKPQLGDVAVYTNGQ YGHIQCVLSGKLDYYTCLEQNWLGGGF-DGWEK 118 

NTP F Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG E+ 

Sbjct: 71 NTPDFI^QPGDMWFGSNYGAGYGHVAWIEAT1I)YIIVYEQNWLGGGWTDGIEQ 125 



>gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphylococcus
aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EG AG VD FDGAYG FQCMD LS VAYVYY I TDGKVKMWGNAKDAI NND F KG LATVYKNT P S F K P 74 

E G DFDG+YG+QC DL Y ++ ++ +G N+F A +Y NTP+FK 

Sbjct: 252 ENRGWD FDG S YGWQC FD LVNVYWNH L YGHG LKGY GAKD I P YANNFN S EAKI YHNT PTFKA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCLEQNWLGGG FDGWEKAT I RTHYYD 127 

+ GD+ V++ G YGH VL+G+ D + L+QNW GG+ E A H Y+ 

Sbjct: 312 EPGDLVVFSGRFGGGYGHTAIVIiNGDYDGKLMKFQSLDQNWNM 371 

Query: 128 GVTHFIRP 135
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAG VD FDG AYG FQCMDIiS VA YVYYI TDG KVRMWGN AKDA I NND FKG LATVYKNT PS FK P 74 

E G DFDG+YG+QC DL Y ++ ++ +G N+F A +Y NTP+FK 

Sbjct: 252 ENRGWD FDG SYGWQCFDLVNVYWNHLYGHGLKGYG AKD I PYA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCLEQNWLGGG FDGWEKAT I RTHYYD 127 

+ GD+ V++ G YGH VL+G+ D + L+QNW GG+ E A H Y+ 

Sbjct: 312 E PGDLWFS GRFGGGYGHTAI VLNGDYDGKLMKFQS LDQNWNNGGWRKAE VAHKVVHNYE 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187
[Staphylococcus phage 187] 
Length = 628 

Score = 76.9 bits (186), Expect = 2e-13

Identities = 50/144 (34%), Positives = 68/144 (46%), Gaps = 18/144 (12%)

Query: 5 QQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMW GNAKDAINNDF 59 

+Q +W G+GVD DG YG QC DL Y++ R W GNA+D + 

Sbjct: 12 KQWDWAINLIGSGVDVDGYYGRQCWDLP-NYIFN RYWNFKT PGNARDMAWYRY 64 

Query: 60 KGLATVYKNT PS FKPQLGDVAVYTNGQY GHIQCVLS-GNLDYYTCLEQNWLGGGF 113 

V++NT F P+ GD+AV+T G Y GH V+ Y+ ++QNW 

Sbjct: 65 PEX3FKVFRNTSDFVPKPGDIAVWTGGNYNWNTWGHTGIWGPSTKSYFYS 124 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKF 137 

A H Y GVTHF+RP + 
Sbjct: 125 YVGSPAAKIKHSYFGVTHFVRPAY 148 
>gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE ALE-1
PRECURSOR >gi|1890068|dbj|BAA13069| (D86328) ALE-1
[Staphylococcus capitis]
Length = 362

Score = 73.4 bits (177), Expect = 2e-12

Identities = 47/117 (40%), Positives = 61/117 (51%), Gaps = 10/117 (8%)

Query: 132 FIRPKFSGSNSKALETSKVNTFGKWKRNQYGTYYRNENGTFTrcFLPIFARVGSPKLSEP 191 

F++ GSNS TS N G +K N+YGT Y++E+ +FT I R+ P S P 

Sbjct : 252 FLKSAGYGSNS TSSSNNNG-YKTNKYGTLYKSESASFTAN-TDIITRLTGPFRSMP 305 

Query: 192 NGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNGKTGNSYSVGIPWG 247 

+ Y+EV DG+VW+GYN G R YLPVR WN TG +G WG 
Sbjct: 306 QSGVLRKGLTI ICH5EVMKQDGHVWVGYNTNSGKRVYLPVRTWNESTG ELGPLWG 359 



>gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus
simulans >gi|153047 (M15686) lysostaphin (ttg start
codon) [Staphylococcus simulans]
Length = 389

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK --WKRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 258 H FQRMVNS FSNSTAQD PM PFLKS AG YGKAGGTVTPTPNTGWKTNKYGTLYKS ESAS FT PN 317 

Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-OGTRYYLPVRQWNG 234 

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 318 -TDIITRTTGPFRSMPQSGVUCAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 376 

Query: 235 KTGNS YSVG I PWG 247 

T ++G+ WG 
Sbjct: 377 STN TLGVLWG 386 



>gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR

(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|79927|pir||S01079
lysostaphin precursor - Staphylococcus simulans bv.
staphylolyticus >gi|581744|emb|CAA29494| (X06121)
lysostaphin (AA 1-480) [Staphylococcus simulans bv.
staphylolyticus]
Length = 480 

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 349 HFQRMVNSFSNSTAQDPMPFLKSAGYGICAG<nVTPTPNTGWKTNKYGTLYKSESASFTPN 408 

Query: 176 FLPIFARVXSSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-0/5TRYYLPVRQWNG 234 

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 4 09 -TDIITRTrcPFRSHPQSGVLKAGQTIHYDEVMKQDGHVWGYTGNSGQRIYLPVRTWNK 467 

Query: 235 KTGNS YSVG I PWG 247 

T ++G+ WG 
Sbjct: 468 STN TLGVLWG 477 



>gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR

(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|2072411 (U66883)
lysostaphin [Staphylococcus simulans]
Length = 493

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 362 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 4 21 



Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234 
I R P S P + Y+EV DG+W+GY G R YLPVR WN

Sbjct: 422 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 4 80 

Query: 235 KTGNSYSVGIPWG 247 

T ++G+ WG 
Sbjct: 4 81 STN TLGVLWG 490 



>gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan
hydrolase) [bacteriophage phi PVL]
Length = 484

Score = 68.3 bits (164), Expect = 6e-11

Identities = 52/150 (34%), Positives = 71/150 (46%), Gaps = 17/150 (11%)



Query: 3 SQQQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGL 62

++ QA++W G + D YGFQC D++ + IG+R+G IDK

Sbjct: 4 TKNQAEKWFDNSLGKQE^PDLFYGFQCYDYASMF-FMIATGE-RLQGLYAYNIPFDNKAR 61

Query: 63 ATVY KNTPSFKPQLGDVAVYTN GQYGHIQCVLSGNLDYYTCLEQNWLGGGF- - 113

Y KN SF PQ D+ V+ + G GH++ V S NL+ +T QNW G G+

Sbjct: 62 IEKYGQI IKNYDSFLPQKLDIWFPSKYGGGAGHVE I VESANLNTFTSFGQNWNGKGWTN 121

Query: 114 DGW--EKATIRTHYYDGVTHFIRPKF 137

GW E T HYYD +FIR F

Sbjct: 122 GVAQPGWG PETVTRHVHYYDD PMYF I RLNF 151




Query= pt|110882 44AHJDORF012 Phage 44AHJD ORF|8391-8813|3 1
(140 letters)

>gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN
SPOIIIC-CWLA INTERGENIC REGION (ORF2)
>gi|322189|pir||B44816 orf2 5' of autolytic amidase -
Bacillus subtilis >gi|142801 (H59232) open reading frame
2 [Bacillus subtilis] >gi|1217874|dbj|BAA06959| (D32216)
ORF121 [Bacillus subtilis] >gi|1303767|dbj|BAA12423|
(D84432) YqdD [Bacillus subtilis]

>gi|2635036|emb|CAB14532| (Z99117) alternate gene name:
ygdO; similar to holin [Bacillus subtilis]
Length = 140

Score = 80.4 bits (195), Expect = 6e-15

Identities = 45/130 (34%), Positives = 67/130 (50%), Gaps = 3/130 (2%)



Query: 4 VTCFRFTDSEAFHMFIYAGDLKLLYFLFVIiMFVDIITGISKAIKNNNLWSKKSMRGFSKKX 63

+ F D ++F G +K L L VL +D++TG+ KA K L S+ + G+ +K

Sbjct: 8 INFETLDLARVYLF GGVKYLDLLLVLSIIDVLTGVIKAWKFKKLRSRSAWFGYVRKL 64

Query: 64 XXXXXXXXXXXXXXXXXXKGGLLMITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVI 123

G L T+ +YIANEGLSI EN A++ V +P I D+L+ I

Sbjct: 65 LNFFAVTIJVNVIDTVI^LNGVLTFGTVLFYIANEGLSITENLAQIGVKIPSSITDRLQTI 124

Query: 124 KNDTBKSDNN 133

+N+ E+S NN

Sbjct: 125 ENEKEQSKNN 134





>gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-105] 
Length = 135 

Score = 76.1 bits (184), Expect = 1e-13 

Identities = 44/115 (38%), Positives = 61/115 (52%), Gaps = 4/115 (3%) 

Query: 21 GDLKIAYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGFSKK^ 80 

G++K L + VL +DIITG+ KA K L S+ + G+ +K 
Sbjct: 17 GEVKYIiDI>ILVLNIIDIITGVIKAWKFKELRSRSAWFGYVRKMLSFLWlVANAIDTIMD 76 

Query: 81 XKGGLLMIT I FYYIANEGLS I VENCAEMD VXVPEQI KDKLRVI KND TEKSD 131 

G L T+ +YIANEGLSI EN A++ V +P I D+L VI++D TEK D 
Sbjct: 77 IiNGVLTFATVXFYIANEGLSITENLAQIGVKIPAVITDRLHVIESDNDQKTEKDD 131 



>gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN NAGH 
3' REGION (ORFD) >gi|1075967|pir||S43905 hypothetical 
protein D - Clostridium perfringens >gi|455154 (M81878) 
ORF D [Clostridium perfringens] 
Length = 132 

Score = 60.9 bits (145), Expect = 4e-09 

Identities = 38/127 (29%), Positives = 63/127 (48%), Gaps = 3/127 (2%) 

Query: 1 MNE^KFRPTDSEAFHMFIY-AGDLKIJjYFLFVLMFVDIITGISKAIKNNNLW 59 

+N +K+ +1+ A D+ L+ L V +F+D +TG+ K K+ L S +RG 

Sbjct: 5 INYIKWGIVSLGTIjFTWIFGAWDIPLITLL-VFIFLDYLTGVIKGCKSKELCSNIGLRGI 63 

Query: 60 SKKXXXXXXXXXXXXXXXXXXXKGGLLMITI-FYYIAKEGLSIVENCAEMDVLVPEQIKD 118 

+KK + I ++YI NEG+SI+ENCA + V +PE++K 

Sbjct: 64 TKKGL I LWLLVAVMLDRUiDNGTWMFRTL I A YFYIMNEG I S I LENCAALG VP I PEKLKQ 123 

Query: 119 KLRVIKN 125 

L+ + N 
Sbjct: 124 ALKQLNN 130 



>gi|2293160 (AF008220) YtkC [Bacillus subtilis] 

>gi|2635548|emb|CAB15042| (Z99119) similar to autolytic 
amidase [Bacillus subtilis] 
Length = 134 

Score = 36.4 bits (82), Expect = 0.099 

Identities = 25/109 (22%), Positives = 41/109 (36%) 

Query: 17 FIYAGDLKI^YFLFVXMFVT>IITGISKAIKNNNLWSK^ 76 

F + G LLM++I+K +LKK KK 

Sbjct: 20 FFFGGFQYSFLILLSLMAIEFISTTLKETIIHKIiSFKKVFARLVKKLVTIiALISVCHFFD 79 

Query: 77 XXXXXKGGLLMITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVIKN 125 

+G + +1 +YI E + IV + + + VP+ + D L +KN 
Sbjct: 80 QLIOTQGSIRDLAIMFYILYESVQIVVTASSLGIPVPQMLVDLLETLKN 128 



>gi|H81973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophage 
CP-1] 

Length = 134 
Score = 31.3 bits (69), Expect = 3.3 

Identities = 27/88 (30%), Positives = 36/88 (40%), Gaps = 5/88 (5%) 

Query: 29 LFVLMFTOIITGISKAIICKOTnJWSKKSMRGFSKKXXXXXXXXXXXXXX 86 

LF L+ D ITG KA K S ++G K G +L 

Sbjct: 18 LFALILFDFITGFLKAWKWKVTDSWTGLKGVI KHTLTFI FYYFVAVFLTYIHAMAVGQI L 77 

Query: 87 MITIFYYI ANEG LSI VEIN CAE MDVLVPE 114 

++ I Y A LSI+EN A M V +P+ 
Sbjct: 78 LVIINLYYA LSIMENLAVMGVFIPK 102 
Table 21 

Phage 182 complete genome sequence. 17833 nucleotides. 

1 tagaatattg tcataaaaca caaacataat aatgcatatt attgtttaca aatatgtaat ttcgtgatat 

71 aatacatttg taagttaaag gaggtgacaa aagaacaaac cataaatgct ttagaaactg caaaaactat 

141 tggaggaaaa ataatgaaat attcactaca acaaatagat gaaattaaat caacaatttt cagaattaga 

211 ctaaaaaggc atgaactaga ggaattggtg gacgaagtaa acgatattgc taaagatccg gaggaaagat 

281 atcttttatc gttttattac acagaagaag aacgtttgtt tgaaattccc tctgcaagat taatagatta 

351 ttacaacgaa aagatcacaa atctgaaatc ggaaatcata tcactcgaaa aaagattaca aaaactagta 

421 aaataattac acaaaaagct ttacaaatat aacacatcat gctatactaa aagagtagca agggaacgga 

491 aaatacctta cttcacacct caatcattct tatcaaaata caaaaggagg gaaaataatg ggtcgaaaac 

561 taatgcaacg aaaegtaaca tcaactaaag tagaattctc agaagttatc gtacaagatg gagcgccaac 

631 aattgtacca tgcgaaccag ttgtcttaac aggaaaactt tcagaagaaa aagctttatc agegatcaaa 

701 egtaaaaace ctgataaaaa cgtagttgta acaaatgttt cacatgaaac agegctttae acaatgecag 

771 tcgataaatt tatcgagtta gcagacaaat caacacaagc ctaataaaaa caaaactaaa acaaaacaga 

841 ggagattata atcatggaaa tcgtaaaaag cacatttgac acacaaacac cagaaggaat gttacaagta 

911 ttcaatgeca caaaegggge ttcaattccg ttacgtaacg caattggega agtactagaa ttgaaagata 

981 ttctagttta ctcagacgaa gtttctggtt ttggtggagc cgaaccatca caagcagaac tagtegcttt 

1051 cttcacagaa gatggtaaaa ettatgeggg tgtatcagca gtagcaacaa aatcagctaa aaacctaatt 

1121 gatatgatga ctgctaaccc tgacatcaaa ccaaaaattt cttttgtcga aggaaaatca aacggtggac 

1191 aaaaatttgt aaatctacaa gtggtttcac tgtagcataa aaatacagga atctagtaag ccacttagcg 

1261 aatctegcta ggtggttttt attatgtttc tacattgagg tgtgtagaat tgaccgtaag aatatcaaag 

1331 aatgatagag ccaagttaga gaaaatctac ggtaaatcta acaaagctcg taaaaaatac aatcgtttaa 

1401 gacaaaaagg agttgaggaa aggcaacttc caactgttcc aacatcaaag aaaagactta ttgactacgt 

1471 aaaatcaaca aatatgagtc gtagtgattt taacaagatg ttagacgagt tggtagattt tgcacaacct 

1541 tacaacgaga attacatttt tgagatcaac aagcgaaatg ttgeaatetc aagagcgcaa atcaaagaag 

1611 cgcaaattaa aacagagcaa gctcaaaaag cgaaagaaga acactacaaa gagcttaaca aagttgaagt 

1681 taagaagece acagaaaaca caattgtcac accaactatt ttaacagagt taggtgctga cttacctttt 

1751 caagcaatac cagattttaa tattgacget ttcacttctc cagaaggagt tcagtcttat ttagaaaata 

1821 taggaaaaca agacgaacaa tattttgacg aaagagacca actttattac gacaatttca gaeaagegat 

1891 gtttactatt ttcaattcag aegctgaega tattgttcgt ttacttgact caatggggct tgatctattt 

1961 atgaaaacat atgttagtaa cttcttagac atgaaccttg actacattta tgacgaagca gaagtacaac 

2031 agaaaaaaga acaagtttac agtaagattg caaaagtgat cgagtctgaa acaggtggag aagtcccctc 

2101 atataacccc acgaagaaca tcacaattaa ttcagaaaca ggagaagaat tatgattaag aaatatactg 

2171 gcgactttga aacaacaact gatctcaacg attgtcgtgt atggtcgtgg ggcgtatgcg atatagacaa 

2241 cgttgacaat atgacgttcg gtttagaaat cgattctttt tttgagtggt gtaaaatgea aggcagcaca 

2311 gacatttatt tccacaacga aaaatttgac ggagagttta tgctttcatg gttattcaaa aatggtttca 

2381 aatggtgtaa agaagcaaaa gaagatcgaa cattctccac actcatatca aatatgggtc aatggtatgc 

2451 tttggaaatt tgttgggaag ttaattacac aacaacaaaa tcaggtaaaa cgaaaaaaga gaaatctcga 

2521 acaataattt atgatagect taaaaaatat ccttttccag tgaaacaaat tgeagaaget tttaattttc 

2591 ctataaaaaa aggegaaata gattatacaa aagaaagacc tattggttac aaaccaacaa aagatgaatg 

2661 ggagtattta aagaacgaca ttcagattat ggcgatggca ttaaaaattc aattcgatca aggactaact 

2731 cgaatgacta gaggaagega egctttagge gattacaaag attggctaaa agctacacat ggaaaatcaa 

2801 ctttcaaaca atggtttcct attttgtctt tagggtttga taaagactta cgtaaagcat acaaaggegg 

2871 cttcacttgg gtaaacaaag tttttcaagg gaaagaaata ggtgacggca ttgtctttga tgtcaactct 

2941 ttgtatccct ctcaaatgta egtaagaect ttaccatatg gaacacctct attctacgaa ggagaataca 

3011 aaccgaacaa cgactatccg ctgtacattc aaaatatcaa agtaagattc cgtttaaagg agggttatat 

3081 tccaaccatt caagttaagc aaagttcatt attcattcaa aacgaatatc ttgaatcaag tgtaaacaag 

3151 ttaggagttg acgaattaat cgatcttact cttacaaatg ttgacctaga attatttttt gaacactacg 

3221 atattttaga gatacattac acttaeggat atatgttcaa agcttcttgt gatatgttca aaggctggat 

3291 cgataaatgg atcgaagtaa agaacaccac cgaaggggct agaaaagcta aegecaaagg tatgttaaat 

3361 agcttgtatg gaaagttcgg aacaaaccct gacattacag gaaaagtgcc ttacatgggc gaggaeggea 

3431 ttgttcgatt gacactagga gaagaagaat taagagatcc tgtttatgtt ccgcttgcta gttttgtgac 

3501 ggcttggggt agatatacta ccattacaac cgctcaaaaa tgttttgatc gcattattta ttgtgataca 

3571 gatagcattc atctagtagg aacagaagtt ccagaagcaa tcgatcactt ggttgatcct aaaaaacttg 

3641 gctattgggg gcatgaaagc acatttcaac gagcaaaatt catteggcag aaaacatacg tagaagaaat 

3711 tgatggcgaa ttaaatgtaa agtgtgctgg tatgecagat cgaataaaag agattgtaac ttttgacaat 

3781 tttgaagttg gtttttcaag ctatggaaag ttgetaccta aaagaacaca aggtggcgtg gtattagtag 

3851 acacaatgtt tacaatcaaa taaggaggac taataatgga actatataaa gcaatgttta tcgtacgtga 

3921 tgaaggtact attgaeggtt acgatactga acactatgta gatatttctt tacatgactt tgaagaaata 

3991 tatggaaaag aaacacgtga aattgaagca gtaacattag taaaaacagg aaatttaaaa aaataaatta 

4061 tttacatcct ttgcaaagta tggtaaaata ttcttgtgat agttgacaag agtcaaattt ggcgagattg 

4131 ggcgaatgta cacgtgaaat ategtgeget cccgttaagt tatggacaca taaacgtttt gaccgtcaac 

4201 caategcaaa aaccttttag gagtagcect taaatgtggc tactcttttt tgtgtttcac agaattatgt 

4271 ttcacgtgaa acagttttta tggtataata gaatcaaaag gaggtggaga ttatggaaat taaagaacarr 

4341 gaatcaattt taaatggtat tcttgaaagt gtcacagacg gtgaagcaag atcaaagatt gtagaacatc 

4411 ttgaagcatt gegagaagae tacggagcaa caactgaagc tttgacatca gcaaatagca cacttgaaaa 

4481 gttaaagaaa gataacgaag cgttggttat ttcaaactca aaattgttcc gagaacgagc gatcgtagaa 

4551 ccagcagaaa ataacgaacc agaaacagac cagaatatta cactagacga tttaggaatt taaggaggaa 

4621 aaaacatggc tgacaaaatc acagaacaag atgttcttcg tgccacaaat gtagaaacac cagtacaatt 

4691 aatgactget atttataata gttcatcatc tctttttcag gegaaegtae etatgecaaa tgcagataac 
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4761 atcgaagcgg ttggtgcagg gatcacacgt 

4831 accgtattgg taaagtagtt atccgataca 

4901 catgccttta ggtcgaacga ttgaagaaat 

4971 gagtctgtta caggggtatt taaacaggaa 

5041 aaggttacta caaacaaacg atccaagaag 

5111 tagtttcgtt gctggtgtaa tgaacgcttt 

5181 ttattaatag caaactacca agaaaaagag 

5251 atgcaaaaga atttatccgt aagatcaaat 

5321 cgctcaagga gttaaaacat ctacctcaaa 

S391 accattgacg ttgacgtttt agcagcggca 

5461 ttattgatga gtttcctaaa aaagaaggcg 

5531 atggtttatg atctacgaca aattgtacaa 

5601 tattggttgc accaccacca actatattct 

5671 caacaaaacc tgtcacaaaa gttgcttttg 

5741 tatcgcattg acatttacac cagtagaagc 

5811 ttggttaagg caaccgtaaa acaaacagca 

5881 gtcaatcatt agtaacattc acagctatcg 

5951 ctaaggagga caattatggc aagaaggtat 

6021 cctatacaca cacaagatgg tttaaaactc 

6091 taacgagaat agagattgtt cttatcaaag 

6161 aaagacgcct tatatgcttg taactatctc 

6231 atgcctttgt tactgatatt gaatataaga 

6301 acaaacttat cgtttcgata ttggtatacg 

6371 tcgaatggaa tacctttcat taatacaatt 

6441 atgtaacaac ttttcatcct aacgatggag 

6511 tggagataag gaagataaat caggaggatc 

6581 cctatcaatt caagtgggga ggtatacaaa 

6651 cgtttcttac aacgaaagaa ccttttttaa 

6721 accattcatt gtggatzcacg cgaacaaaac 

6791 ccaacctacg ctagtgatcc aacaggaaca 

6861 cattcgtacc taaaagaatt gatcttgtag 

6931 tgttaaggaa tcaaaactat ttatgtatcc 

7001 atgactttaa gacctgaata tcttacaggt 

7071 ctaataaagt gatgatcgag ccgattgatt 

7141 caagatgtta atcgataatg atcctaacga 

7211 ggaaacaaaa actccttgat tgctcaagag 

7281 gtgcaatgag tacaggagga gcgatctttt 

7351 catcatggga gcaggacaac aagtaaacaa 

7421 ggtaaagtgg cagatatcga aaatattcca 

7491 caggaaactt tcaaaactat tatcaattgc 

7561 tcgttacttc tcaatgtatg gcacaaagag 

7631 tggaatttca ttaaattaaa agaaccaaat 

7701 aacaaatttt tagtgcaggc gttacgcttt 

7771 agatgtatag gaaggaggaa taagatgagt 

7841 cagcaaaaag cagaccttat ccaaatgaac 

7911 ttatcgtaga caactcacgc tccttacgtt 

7981 cctcgttatt tagaaattgc tttacacact 

8051 tcatggtttg cgcaggggca gaagatggtc 

B121 cgaagcaatg tatcacaaga gatatcctgt 

8191 atgttgtata ataatgactt gaaagttcct 

8261 acataaacca gatatcacga gtgaatcgaa 

8331 gaaatacttc tcattgctac aagcttataa 

8401 gatatggagt ttgacgaatc ttttaatgta 

8471 cagaattgaa cgaagtatgg aatgaagtgt 

8541 tgcacgtgta caaacatcag aagtcttatc 

8611 aaatcaagaa aagagttttg cgatcgtgta 

8681 tgaagtttag aacagacgcc gttcgacaat 

8751 tggagggttg ccaagtgcta cttaaacgtt 

8821 aaaagaacgt attgaagttg gccgaaaaca 

8891 cgagcagaat ttgaaacaaa atttatcaat 

8961 catttaagtt taatcttgac gaatatttaa 

9031 tcttgaagag tttccgattt ttgatgacat 

9101 attgatacaa acatcaaagc gaatcgtgat 

9171 acagaaacaa aaatacacgt gacacaggaa 

9241 tcaaaaagat ttgagaattg ccagcaatgg 

9311 gaagatttga gtaaagaaac aacaagctcc 

9381 cacgaagcaa tgcttctgaa aaagaaacaa 

9451 tacgattaca cgatataaag gtaaaaaggg 

9521 agtgttttga gaattgagaa aatgatcttt 

9591 gagggaggta gcaacaatgg tagattttaa 

9661 gaacgcttta gcaaatatcc tcatactgaa 

9731 taattgccta tctgaatgaa gttggtgctt 

9801 acattttgtt gagaagttag aagagatcac 

9871 gaaaatttaa tcaatgatac tgtttttgca 

9941 ctgaaacacg tgctaacagt gtgaatattc 

10011 attttggtat aagattcaac gcgacaatac 
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ttagacgtag taaaaaacga atttatttca actttagttg 
aatcttggcg taaccctttg aaaatgttta aaaaaggaaa 
ttttgttgac attgcacagg aacataagtt caaccctgac 
gttcccgatg taaaaacatt gttccacgaa attaatcgtg 
catggttaga aaaagcattt acttcatggg ataatttcaa 
atacacaggt gacgaagtaa gcgaatttga atacacgaaa 
ctattcaaag agatcgaaat tggcgaaatt actgaatcaa 
caacctctaa caaattagaa tttatgagtt ccgcttacaa 
atctgatcaa tacgttatta ttgacgccga cacagacgca 
ttcaatatga gtaaaactga ctttgtagga cacaaaatcg 
aagaatcgtc aaatattgtg gcagttattg tagatagtga 
aacaacaagt ctatacaacc ctgaagggtt atattggaat 
acttctcaat tcgggaacgc tgttgctttt gttaaatcag 
caagtgcaac aactagtgtt gttaaaggat catctaaaga 
aacaaaccaa caaggagaag ttgtttcatc agcaccagca 
ggtaaagcga ctgccgtaac cgtagaaggc ttagaagtcg 
gaggtcaaca agcaacggtt cttgttacgg ttacttctga 
acaaatgtaa aattgttggc taacgtgcct tttgataaca 
aacaggaaca ggaatcgtac tttaattcgt ttcctgttct 
ggatacacaa ctcgggggag tttttagagt agataaacac 
atctttaaaa acgaagaaac ttatcctagt aaatggcagt 
atgacaacac aagtttcgtt acctttgaaa ttgatgtttt 
agaaagtttc attgcaaaag aacaccctca actttattat 
gaagagtcgc ttgattacgg tagagaatac acaacaacaa 
tcaattttct tgttattcta acaagtgaag caatgccagt 
aatagtaggt ggcccatctc ctttttccta ttatttactt 
ccaaatgggg caggcaatgc taattttgga gagtacatgg 
ataagatagt cgggatgtat gtaacgtcgt atacaggtat 
ggtaaggtat aatgcaggag gttcttataa gatcatgctt 
atgaaaacat tcgctttctt ttgtgtaaaa gaagcaagaa 
ggaacgtgta taactacttt agagaagctt ttccgtttaa 
ctattgttta atagaaatta cagatacaaa aggacatgta 
ggtaaattga gtgtatatgt aaaaggttcg ttaggaattt 
atgatgtaag taactcaacc attattacca atttaagtga 
tgtaggagtt aaatctgact atgcttctgc attcatgcaa 
caaaacattc gcaatacttt cagacatggt atgggaaaca 
cagccttagc aagtaacaac ccttttgttg gtttgactaa 
ctatgtttct gaaaaagaaa acggtttgaa cctcttggca 
gataatgtaa cacagcttgg atcaaactta tctttcacaa 
gcttcaaaca aattaaatat gagtatgcaa caagacttga 
caatcgagta gctacaccaa acttacaaac aagaaaagca 
attgtaggca caatgagtaa cgatgtatta acacgtgtga 
ggcatacgaa tgatgttttg aattataacc aagacaacgg 
agacgaaaag gtgcaggact tgctagaaat aaccgttata 
cctattcaag tgatgtagaa gaaatcagct actatgaaca 
tcagttgttt gaatgggaaa atttgccaaa atcaattgac 
aatggttatc ttggtttctt taaagaccct acacttgggt 
aaatcgatca ttatcacaac cctattttct ttacagcaaa 
tttaagatat gatgatgatg atgataaatc aaaatgtatc 
acgttaccaa gtttacatcg ttttgcttta gatatggcgg 
gagcgcaaaa aacacctgta attattcaaa ctgatgaaaa 
ccaaattgac gaaaataatc aggctgtttt tgtggataaa 
tggcaaacaa atgctccata tgtagtagat aaactacgat 
taacttttct aggtatcaac aatgctaacg tagataagac 
taacaatgaa cagattgaaa gttcaggtaa catcttgtta 
aatcgtgtct ttggcgatga acttgacgga aagattgacg 
tacaactggc ggcaggtcaa tcaaaaaaag accagatgag 
atattgaaag tttcacttat taccaacctg aattatctcg 
attgtttgat tttgattatc cgttttatga cgaaacaaaa 
cacttttact tgagagagat aggctcagaa acgatgggat 
atctaaacat gccctattgg aataaaatgt tcctatcaaa 
ggactacacc attgatgaga aacagaaatt gttaaatgag 
gaatcgaaga accaaacgaa gcaagtagat caaacagaca 
caaccgattc tttctcaagg aacacttata cagacacccc 
agatggaaca ggtgtaatca attatgcaac aaatatcaca 
acaggcgttg aaacaaacaa cgacaaaaca aatcaaaata 
agaacacaga cattaataaa gatcaaaatc aaaccaaaga 
aaacactgat tatgctgact tactcgaaaa atatcgtaga 
agagaaatga acaaggaagg cttatttctc cttgtttatg 
ccccgacaag cggtttgacg gtttacccgc tgtattcaaa 
tacagatatg aattactatt agatgaagaa gtatcggctt 
tagttaatga tatgagtggt tatttaaatt actttatcga 
aaatgacaca ctcaaaaaat ggttgtctga tggtacgtta 
aattatatca aagaaatcaa aagattacaa atcttggttg 
ttttgacaaa aaataaaccg gatgttgctg atgatcgaac 
tgattatgga gccgatccta ttgacacgtt acgtattgtt 
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10081 gcaatcaata aagttagtgg ctggaatacc gctacaggag atatttatct taacattaaa ggaacggagg 

10151 gtgtataatg gcagacatta gaacacaacc aacaagtgaa gatggatcag acaatttatt tccaatttca 

10221 aaagccgtta atattatgac taatagcggt acgaatgtag aaggagaatt gggtacactc aaacaaaatg 

10291 acgaaacaat gaatacctca gttcaaaatg ctgtagttac tgccaatcaa gcaaaagatt ctgtagctga 

10361 attaaatgta aatgttggta aactaaccaa tcgaataaca acattagaga gtacagtggc taatcttgat 

10431 ggtactcgtt atgtagaggt gtaatatggc agataaaaat attcaaatgc aggataaaga tcataatcgt 

10501 ttaatgcctg ttacaattgc taaaaacgtt ctaacaggcg actctaatct tgaattagtt aatgctgaaa 

10571 taagaggtaa cgctagtgaa gctaaaacac ttgcacaaca agctaaagaa actgctgctg gtttgtcaac 

10641 agaaattgac acagtaacat caaccgcaaa tcaagcgttg acgaaggctg gtacagcaca acaaaccgca 

10711 gaacaagcga aaacaacagc aaacagcatc agcgcagttg caacggcagc taaaaacaca gctgattcag 

10781 cacaaaaaag tgcaactgat ctagctgttc gagtaagcag tttagaggac acagcaatac aatatactgt 

10851 atcaccatag gaggaaaaat aatggcaaat aaaaatattc aaatgaagga tagcaatgac aataatttat 

10921 atccaagtgt tcgagcagaa aacttgttag atttgaccag tcgtgctgaa ttaacaatga caaattgtca 

10991 attatatgea gctggtgata aaacaaatgc aatctcttat ctcggtgcag taggtatget cgaaggtatg 

11061 ataaagttta ctgaaagttt gacaaaccct gtgaccacaa cgctaccaga aggttttaga ccaataagaa 

11131 caaaaegtat tggttgtttc gcaaaatact acacaccaaa tccaacagat acaaaagaaa tggtttatgt 

11201 atcaatcaca cctgatggca aagtaactgt aaatgacaat gtaggtaaaa tcgaatatct atccctagat 

11271 aattgcgttt tccctctaaa ataaggaggt tcatatggaa gaacgaattg atattcaaat gaacaagatg 

11341 aaagaagaaa atcaaaagaa ttacctattg caccctgaaa cgaacccgaa acaagttgtt tttgatgaaa 

11411 cattgeatgg aaatgaaaat caggagagtt tcaacaattt tgttgacaca agaaaaatga caactacaat 

11481 tgatgtaagt gcttatgggg ttatcgctga cggtgtaaca gattgtacac caatattaaa taaattactt 

11551 gaagaaaaaa gcgaaatggg tatcactttt tattttcctc cttgtgaacg tgattcatat tategctttg 

11621 ctaacaccat tgaattgaaa cgtgatgtac ctgtagttac tttcttagga tegggagaaa cgacattaaa 

11691 gtttgaaaca atgaeggcat ttaatgtaaa catcgaaagt ttcaatattg atggttttgc attatggttg 

11761 ccacaaggcg ctcaaagtgg taaaggaatt ttctttaatg atactegcaa ttacaatcgt tttgactttg 

11831 atttgtttgt tegtaactgt actttaaatg aaggaacgta tgttgttgtt gctagaggta gaggggttac 

11901 atttgaaaat tgtctattct ctaatatctc tcaagcaatt atcaaaacag cttttcccga tgtaaatggt 

11971 atgtggcaag ggaacgatat caatactagg ggtacaggtt ttagaggttt ctttgtgaaa aacaacegta 

12041 ttcatttttg tacagegate attatcgaca atgacgatga ttatcagaat gtaattaatt tctgtgaaat 

12111 ttctggtaac acaatcgaag gtggcgtaag ttattatcga ggatatgege ataacttgea tgtccaaaac 

12181 aacaaccatt ttctagcata eggaaataga aacgetttgt ttgagtttca agatgtggat caagcttata 

12251 ttgatgtaga tgtttattgt cgtaactcac aagtcgaggg aatgaatagt acagctattt cacgtttaat 

12321 tgttgtttac ggacattacc gaaacttaaa gattacaggt aaattatatc gttgtcaagg acatgttatc 

12391 acgttgtatg gcggtggcgt taatttctat tgtgacttga tggcacaaga agcacctttg aeggaeggtt 

12461 aceggtttat teaaaegget gacaatcgag ttaactatga tgggtttgtt gttcgtggtt tgtctaattc 

12531 aacaaaagta aatacaccaa tgatctataa agcacctcag actgttttct ataategtag aatcgatcat 

12601 gtgctaacag gtccaaatgc aagtaatgta tataactagg aggatatgag atggcaactc ttacaaatga 

12671 acaaatagct agaggacaaa caategctaa aatactttca aaatatggct ataataaaaa ttcacaagta 

12741 ggagttgtcg ccaatctcca ttgggaatcg gctggtttga acccgaacag caatgaatat ggtggaggcg 

12811 gatatgggtt aggtcaatgg acgcctaaaa gcaatcttta tcgccaagca caaatttgtg ggttgtctaa 

12881 tgetaaaget gaaacgttgg aaggtcaagc agagatcatc gctcaagggg ataaaacagg tcaatggatg 

12951 gataatacac ctgtttcttc tgcaggttat actaaccctc agaccctttc agcatttaaa caatctgeaa 

13021 atattgatgt tgctacaatt aattttatgt gtcactggga acgccctggt aaacttcata tcgaagaaag 

13091 acttgatctt gcacaagctt atagtaagca tattgaeggt agcggtggcg gtggcgtaaa aegttgetat 

13161 ggaaccccaa tcaagaatac aaatcttgat cctaaaagtt tcatgagtgg acaacttttt ggcacgcatg 

13231 caggaaaegg cagaccaaat aatttccatg atggtttgga ctttggttca attgatcacc ctggcaatga 

13301 aatgattgea tgttgcgatg gaacagtaac acatgttgga acaatgggag cattaagagc gtattttgtg 

13371 ataaatgatg gtacttacaa tategtttat caagaattta gttataacca gtcaaatata aaggtaaaag 

13441 ttggcgacaa agttaagaac ggacaagttt gcgcaatacg tgacgeggat catttacatt taggttttac 

13511 taaaaaagat tttatgactg cgttaggatc ttctttcata gatgatggaa catgggaaga ccctttgaag 

13581 tttttagggc aatgttttgg agatggagat actggeggag ataatgacga taacaataag gataaaaatg 

13651 atcttattta tetattgeta tccgatgcct tgaatggttg gaaattttaa taaggagaaa aaggtatgat 

13721 agaatatatc acacaatggt tggcagatga taatcatctt gtttatggtt tgattatatg gttaatggtt 

13791 gcaatgatta tcgattttgt gttaggtttt acaattgeca aatttaacaa ggaaatcgac tttagtagtt 

13861 ttaaagctaa agcaggtatc attgttaagg tggcagaaac ggttttagtg gtttacttta ttcctgtagc 

13931 agtaaaattc ggtgcagtag gtattacaat gtatataaca atgttggttg gtttgatttt atcagaaatt 

14001 tatagtatac taggacatat ttcagatatc gatgatgata ataattggac tgattatgtt aagaagtttt 

14071 tagaeggaac actcaacaga aaggacgata ttaaatgatg aatggtattg atatctctag ttatcaaaca 

14141 ggaattgatc tttcaaaagt tecatgegat tttgtaaata ttaaagcaac aggeggaaca ggttatgtaa 

14211 accctgattg tgaccgagca tttcaacaag ctttgtcttt aggtaaaaag attggtgtgt ateattttge 

14281 gcatgagagg ggtttagaag gtacacctca acaagaagcg caattctttt tagataatat taagggttac 

14351 attggtaaag ctgttcttat tcttgacttt gaagggtcaa atcagaaaga tgtaaattgg gcgaaagcat 

14421 ttcttgatta tgtttataat aaaacaggcg ttaaagcatg gttttatacg tatacagcaa acctcaatac 

14491 aactgatttt tctagtattg caaaaggega ttatggttta tgggttgctg aatatggatc aaatcaacca 

14561 caaggctact ctcaaccagc gccacctaaa acaaataatt ttccaattgt tgcctgtttt cagtttacaa 

14631 gtaaaggacg tttaccagga tacaaeggea atcttgattt gaatgttttc tatggcgatg gtaatacatg 

14701 ggatctgtat gtaggtaaaa aacaggatca aattgttcct cctgaaaata aaatatttga cgccacaagt 

14771 gatgagttta ttttcactct tacaacaggt ageacaageg tgttttattt tgacggagaa acgatctttg 

14841 aattgtctga tccaacacaa ctcgatcata ttagaggaac atacaatcat gttcatggaa aagaaatccc 

14911 atcaatggtg tggacacctg aacaattcga tatttactta aaaatgtatg aaaagaaacc agtatataaa 

14981 taggagtgta tagtatgaca aatagcttag gcgttaaact tgaagagaaa aacttatact ataaccctaa 

15051 caatgettta ggttttaatt gcctaatgtt gtttgtaata ggcgcacgtg gtataggtaa aacttatggt 

15121 tataaaaaat ttgttgttaa tegctttatt aaacacggcg aacaatttat ttatttaaga agattcaaaa 

15191 cagaacttaa aaagattcct caatttttca aaacaatggc gaaagaattt cctgatcata aacttgaagt 

15261 aaaaggaaaa gaattctatt gtgatgataa attaatgggt tgggctgttc cacttagtac gtggggaatt 

15331 gaaaaatcta atgaatatcc cgaagttcgt acaattttgt ttgatgagtt tttaattgag aaatcaaaaa 
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15401 tcacttattt accaaacgaa gctgaagcct 

15471 tacaagatgt gttatgctga gcaatgcaac 

15541 ccagatttga ataagcgttt taatctatat 

15611 actttgcaga agtgaagaga gaaacacctt 

15681 tatcaacaat gagtttgtca atgatagtga 

15751 tgcgccattg cttttgaagg gaaaatcttt 

15621 gttatgatta tcaaccaaat acaaatcatt 

15891 gctgatgaaa aactggcgaa ataattatta 

15961 cggtttgata acattgttat taagaattta 

16031 attttagtag agctaccacg attagttcta 

16101 gcgatagttt tgttttggtt ctttggcgtt 

16171 ggtgtgttaa tgtagacgaa atcttttctc 

16241 aaatgtagct ataggacgtc catttctttc 

16311 cggctatatt ttaatgcttt tgttaaggtg 

16381 tataaaatac tgtgatatcg tatattggtt 

16451 ccttttggta tttgtaacgc taactgatag 

16521 cctgacaata cttttcaaga atgttaaatt 

16591 ttcggtgata tttatttccg gaacgtcgaa 

16661 tctgaaaggt tacgtttaca gtagaaacgt 

16731 caatcatttt aattcctcct atttgtccgt 

16801 tgttcaacgc ttttcattga tttcgttatt 

16871 atttatcatg tgttaacacg aactcttttg 

16941 tgtcatttct gacttgatag acgctaaact 

17011 attaatgata aattgttaat catgtaaaac 

17081 ataagattgg tagcattgta tcgaattaat 

17151 acccatatct aattccttta gttcttcaaa 

17221 tcaataagat aatgtttatt gttttcggta 

17291 gaagtagaga tacctctcct ttttcagcta 

17361 aatttgatat tgataccacc aatcaaatgt 

17431 attgagaaag tccagttatc atcaaatgaa 

17501 caaattctaa atagaggaat ttactaagtt 

17571 tgaataaatt tctgtgtata cgatcggttc 

17641 atcatgtatt tacatatatg tcaatcattt 

17711 gatcctttct ttattacatc tatattatat 

17781 tgtagtttgg ggtcagttac atttgtgtta 
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tattgaacat gatggaaacg gttttccgaa gacgtacaaa 
tagtgtagtg aacccttatt tcttgtattt caatctgcag 
caagatcgag gtatattgat tgaattgtgt gattcaaaag 
ttggtagatt gat teg egg a acagaatacg aagattttag 
tacgtttatt gaaaagagaa gtaaaaatag tagtttctta 
gggtattgga tagaegctga aacaggttgt gtctatgtga 
tttatgcaat gactacgaaa gaccatgaag aaaatagatt 
tctttcaaca gtggcgaaag cattcaagaa tagttatctg 
cattatgatt tgtttaataa gatgaaaatc tggtaaccct 
ttacaatgat gaatagtaga taacatagta attgtagtct 
agtgattttt gctaacgcct ttttgtttgc ttttggatcg 
atagttcttt ctccttatac agttttaata attccctgta 
tattctaacg caattcacta tatccatttc taggtatata 
agaggttegg tcttgtgtat caaaacctcc caaccatcta 
ccttgtagaa tgtagecatt attccacctc ctttaaatag 
cgagaaccaa ettttaegta tgaagttact aatttcattg 
gactcgattc gggtaatagc gttgaatgag ttaacaaaag 
atcttgtaaa gtcccctcta tgatctctat tttttcattg 
aaccattcaa ttagttcgcg gtgttctttg aatgttcgtg 
aatttgttta tatcegtcat gtttcaattg ttcegcatag 
gcgatattaa tgcaatggct atcaagataa acatagttat 
taaegtaate aatgtataaa attaattgtt ttcctccttg 
atcgttgtca tctttagtta gttgatttaa accctctaaa 
actcctttta tattaatttg atattgatac caccaatcga 
atgttatttc tgtagttttc catgaatact eggaaataag 
agataacaaa caatattcct catcgcctac ctcatcaata 
tctatgatat gataattcat atcccactca ttaaaggggt 
ttaatgattt attgttcata tgaaacactc cttttatatt 
gattggtagc attgtattaa attaatattc tggataattt 
attgttttat tttcaagtaa etttttagee tcatccacct 
tatcctcatc tctaaaaatt ttcatacata ecaegttatt 
attcatgttt atcatccttt ctttattaca tatatagtat 
aattcattta ttttaatgat ttatttgatt gtttttttat 
catgtatgat tgtatttgtc aacaattaaa ttcatataaa 
tcaaaaaaag ataatattct att 
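
Each row of the Table 21 listing gives the 1-based position of its first base followed by blocks of ten bases. The following is a minimal Python sketch of how such rows might be joined into one sequence string. The function name `parse_sequence_rows` and the label check are illustrative assumptions and are not part of the original disclosure; the sketch also assumes rows whose position labels and blocks are intact.

```python
def parse_sequence_rows(rows):
    """Join rows of the form '<position> <block> <block> ...' into one string.

    Hypothetical helper: the position label on each row is checked against
    the number of bases already read, so a mislabelled or missing row is
    reported instead of being silently accepted.
    """
    blocks = []
    bases_read = 0
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # blank separator line between rows
        label, groups = int(fields[0]), fields[1:]
        if label != bases_read + 1:
            raise ValueError(
                f"row labelled {label}, but {bases_read} bases were read before it"
            )
        blocks.extend(groups)
        bases_read += sum(len(group) for group in groups)
    return "".join(blocks)


# First two rows of the phage 182 listing, copied from Table 21.
rows = [
    "1 tagaatattg tcataaaaca caaacataat aatgcatatt attgtttaca aatatgtaat ttcgtgatat",
    "",
    "71 aatacatttg taagttaaag gaggtgacaa aagaacaaac cataaatgct ttagaaactg caaaaactat",
]
genome_start = parse_sequence_rows(rows)
```

Because every row holds 70 bases, a row whose label does not continue the count (for example a label misread during scanning) is flagged by the check rather than shifting all later coordinates.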
Table 22 



Phage 182 ORFs list 

nb | Name | Frame | Position | Size (a.a.) | Keywords 
1 | 182ORF001 | 2 | 5966..7780 | 604 | Tail protein; 
2 | 182ORF002 | 1 | 2152..3873 | 573 | DNA polymerase; 
3 | 182ORF003 | 1 | I luUD.. I4COO9 | AAA | 
4 | 182ORF004 | 3 | *t0£0..Q9O < r | | 
5 | 182ORF005 | 3 | l^DOl ..1 0/UU | «»3 | Glycylglycine endopeptidase, lysostaphin precursor. 
6 | 1ft90RFAftfi | | 14993.. 1 OUZO | o4J | cricopsiuauon proiein, m i o/o i r*-oiriQiny sue muuT m, 
7 | 182ORF007 | | 77 yO..Of to | 326 | Upper collar protein; 
8 | 182ORF008 | 2 | 141UO.. l**90o | 4t9^ | LyoUiyiiic, mui dmitmsc, 
9 | | 2 | I J 1U..Z100 | £oi | Terminal protein; 
10 | | 2 | Of oD..you i | zro | Lower collar protein; 
11 | 182ORF011 | | you/..iui do | loo | Pre-neck appendage protein; 
12 | 182ORF012 | 3 | | 140 | 
13 | | 1 | IVtuO,. IUOOU | | 
14 | 182ORF014 | 3 | 13716..14108 | 130 | Lysis protein; 
15 | 182ORF015 | 2 | 004..1<££0 | \£.o | Early protein; 
16 | 182ORF018 | -2 | IO*4}<£9.. (O/ Of | | 
17 | 182ORF020 | 3 | 1 U 1 DO. . 1 U*rO*t | 90 | Leucine-zipper motif; 
18 | 182ORF019 | 3 | 40*iJ..**O 1 o | vjo | Head protein; 
19 | 182ORF016 | -3 | 10/ 49.. 1 /uoo | QA | 
20 | | | 1Z00O..1 j1*tS | 9<J | 
21 | | -2 | lib 1 lOM | | 
22 | 182ORF017 | | 134.^*0 | 9U | 
23 | 182ORF024 | 3 | Ol f 4.,0**40 | on | 
24 | 1890RFD95 | 2 | 0*KJ..O l*+ | 88 | Early protein; 
25 | | -3 | | 86 | 
26 | | -1 | iARA9 1A0QR | 84 | 
27 | | 3 | 1j1R79 | 80 | 
28 | 182ORF021 | -3 | 1 f IUO.. 1 r 009 | 77 | 
29 | 182ORF030 | -1 | 10l99..10*r*9 | 76 | 
30 | 182ORF031 | -3 | OJf 9..00Ui3 | 74 | 
31 | 182ORF032 | -1 | 1 1 ISO..! l«MO | 72 | 
32 | 182ORF033 | -1 | /1 797 AQAO | 71 | 
33 | | | 0301,. OlOU | 69 | 
34 | 1 (J4. W 1 \l U£ 9 | -3 | 1 f 41*t,.l /DUO | 64 | 
35 | ift^fiRFrvw | -3 | 1DO/U..10/00 | 62 | 
36 | 1ft20RFmfi | -3 | | 62 | 
37 | i o^unr uo / | | l£U9D..l£<£OU | 61 | 
38 | | 3 | l**f 09.. I | 60 | 
39 | | 2 | GQQ9 10171 | 59 | 
40 | 182ORF040 | -3 | 1 RTI9Q 1 R9fi9 | 57 | 
41 | 182ORF041 | | <3O0O..Hu3O | 56 | Early protein; 
42 | 182ORF042 | -3 | 10671..10832 | 53 | 
43 | 182ORF043 | -3 | 10491..10652 | 53 | 
44 | 182ORF044 | -1 | 6299..6457 | 52 | 
45 | 182ORF045 | -2 | 6571..6729 | 52 | 
46 | 182ORF046 | 2 | 2372..2527 | 51 | 
47 | 182ORF047 | -2 | 13201..13353 | 50 | 
48 | 182ORF048 | -3 | 3243..3395 | 50 | 
49 | 182ORF049 | 3 | 1578..1724 | 48 | 
50 | 182ORF050 | 2 | 8012..8155 | 47 | 
51 | 182ORF051 | 3 | 9390..9530 | 46 | 
52 | 182ORF052 | 1 | 4096..4233 | 45 | 
53 | 182ORF053 | 2 | 15656..15793 | 45 | 
54 | 182ORF054 | -2 | 8002..8136 | 44 | 
55 | 182ORF055 | 2 | 8324..8455 | 43 | 
56 | 182ORF056 | 3 | 6549..6680 | 43 | 
57 | 182ORF057 | -3 | 8133..8264 | 43 | 
58 | 182ORF058 | -1 | 5048..5176 | 42 | 
59 | 182ORF059 | -2 | 15748..15876 | 42 | 
60 | 182ORF060 | -3 | 15276..15404 | 42 | 
61 | 182ORF061 | -3 | 1974..2102 | 42 | 
62 | 182ORF062 | -2 | 1867..1992 | 41 | 
63 | 182ORF063 | -3 | 14181..14306 | 41 | 
64 | 182ORF064 | -2 | 7234..7356 | 40 | 
65 | 182ORF065 | -2 | 3460..3582 | 40 | 
66 | 182ORF066 | 1 | 4234..4353 | 39 | 
67 | 182ORF067 | -1 | 13763..13882 | 39 | 
68 | 182ORF068 | -1 | 7148..7267 | 39 | 
69 | 182ORF069 | -3 | 4908..5027 | 39 | 
70 | 182ORF070 | -3 | 912..1031 | 39 | 
71 | 182ORF071 | 2 | 11741..11857 | 38 | 
72 | 182ORF072 | -3 | 11610..11723 | 37 | 
73 | 182ORF073 | -3 | 2763..2876 | 37 | 
74 | 182ORF074 | -1 | 8813..8923 | 36 | 
75 | 182ORF075 | -3 | 7353..7463 | 36 | 
76 | 182ORF076 | -3 | 2316..2426 | 36 | 
77 | 182ORF077 | 2 | 11858..11965 | 35 | 
78 | 182ORF078 | -2 | 7564..7671 | 35 | 
79 | 182ORF079 | -2 | 7381..7488 | 35 | 
80 | 182ORF080 | -2 | 4372..4473 | 33 | 
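
The frame, position and size columns of Table 22 are mutually dependent, so one entry can be checked against the others. The Python sketch below is offered under stated assumptions: it takes the size to be the number of codons in the position range less one stop codon, and it takes the frame of a reverse-strand ORF to be counted from the opposite end of the 17833-nucleotide genome. Both conventions are inferred from legible rows of the table and are not stated in the original text; the names `orf_size` and `orf_frame` are illustrative.

```python
GENOME_LENGTH = 17833  # nucleotides, from the heading of Table 21


def orf_size(start, end):
    """Amino-acid count for an ORF spanning start..end (1-based, inclusive).

    Assumes the range includes the stop codon, which is not counted.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    return span // 3 - 1


def orf_frame(start, end, reverse, genome_length=GENOME_LENGTH):
    """Reading frame (1..3, negative for the reverse strand).

    Assumed convention: forward frames are counted from position 1,
    reverse frames from the last position of the genome.
    """
    if reverse:
        first = genome_length - end + 1  # ORF start in reverse-strand coordinates
        return -((first - 1) % 3 + 1)
    return (start - 1) % 3 + 1


# Rows taken from Table 22: (name, frame, start, end, size)
rows = [
    ("182ORF001", 2, 5966, 7780, 604),
    ("182ORF043", -3, 10491, 10652, 53),
    ("182ORF046", 2, 2372, 2527, 51),
    ("182ORF047", -2, 13201, 13353, 50),
]
consistent = [
    orf_size(s, e) == size and orf_frame(s, e, frame < 0) == frame
    for _, frame, s, e, size in rows
]
```

A row that fails this check most likely carries a misread digit in one of its columns.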
Table 23 

Predicted amino acid sequences of ORFs from phage 182 

182ORF001 



5966 atggcaagaaggtatacaaatgtaaaattgttggctaacgtgccttttgataacacctatacacacacaagatggtttaaaact 

1 MARRYTNVKLLANVPFDNTYTHTRWFKT 

6050 caacaggaacaggaatcgtactttaattcgtttcctgttcttaacgagaatagagattgttcttatcaaagggatacacaactc 

29 QQEQESYFNSFPVLNENRDCSYQRDTQL 

6134 gggggagtttttagagtagataaacacaaagacgccttatatgcttgtaactatctcatctttaaaaacgaagaaacttatcct 

57 GGVFRVDKHKDALYACNYLIFKNEETYP 

6218 agtaaatggcagtatgcctttgttactgatattgaatataagaatgacaacacaagtttcgttacctttgaaattgatgtttta 

85 SKWQYAFVTDIEYKNDNTSFVTFEIDVL 

6302 caaacttatcgtttcgatattggtatacgagaaagtttcattgcaaaagaacaccctcaactttattattcgaatggaatacct 

113 QTYRFDIGIRESFIAKEHPQLYYSNGIP 

6386 ttcattaatacaattgaagagtcgcttgattacggtagagaatacacaacaacaaatgtaacaacttttcatcctaacgatgga 

141 FINTIEESLDYGREYTTTNVTTFHPNDG 

6470 gtcaattttcttgttattctaacaagtgaagcaatgccagttggagataaggaagataaatcaggaggatcaatagtaggtggc 

169 VNFLVILTSEAMPVGDKEDKSGGSIVGG 

6554 ccatctcccttttcctattatttacttcctatcaattcaagtggggaggtatacaaaccaaatggggcaggcaatgctaatttt 

197 PSPFSYYLLPINSSGEVYKPNGAGNANF 

6638 ggagagtacatggcgtttcttacaacgaaagaaccttttttaaataagatagtcgggatgtatgtaacgtcgtatacaggtata 

225 GEYMAFLTTKEPFLNKIVGMYVTSYTGI 

6722 ccattcattgtggatcacgcgaacaaaacggtaaggtataatgcaggaggttcttataagatcatgcttccaacctacgctagt 

253 PFIVDHANKTVRYNAGGSYKIMLPTYAS 

6806 gatccaacaggaacaatgaaaacattcgctttcttttgtgtaaaagaagcaagaacattcgtacctaaaagaattgatcttgta 

281 DPTGTMKTFAFFCVKEARTFVPKRIDLV 

6890 gggaacgtgtataactactttagagaagcttttccgtttaatgttaaggaatcaaaactatttatgtacccctattgtttaata 

309 GNVYNYFREAFPFNVKESKLFMYPYCLI 

6974 gaaattacagatacaaaaggacatgtaatgactttaagacctgaatatcttacaggtggtaaattgagtgtatatgtaaaaggt 

337 EITDTKGHVMTLRPEYLTGGKLSVYVKG 

7058 tcgttaggaatttctaataaagtgatgatcgagccgattgattatgatgtaagtaactcaaccattattaccaatttaagtgac 

365 SLGISNKVMIEPIDYDVSNSTIITNLSD 

7142 aagatgttaatcgataatgatcctaacgatgtaggagttaaatctgactatgcttctgcattcatgcaaggaaacaaaaactcc 

393 KMLIDNDPNDVGVKSDYASAFMQGNKNS 

7226 ttgattgctcaagagcaaaacattcgcaatactttcagacatggtatgggaaacagtgcaatgagtacaggaggagcgatcttt 

421 LIAQEQNIRNTFRHGMGNSAMSTGGAIF 

7310 tcagccttagcaagtaacaacccttttgttggtttgactaacatcatgggagcaggacaacaagtaaacaactatgtttctgaa 

449 SALASNNPFVGLTNIMGAGQQVNNYVSE 

7394 aaagaaaacggtttgaacctcttggcaggtaaagtggcagatatcgaaaatattccagataatgtaacacagcttggatcaaac 

477 KENGLNLLAGKVADI E N I PDNVTQLGSN 

7478 ttatctttcacaacaggaaactttcaaaactattatcaattgcgcttcaaacaaattaaatatgagtatgcaacaagacttgat 

505 LSFTTGNFQNYYQLRFKQI KYEYATRLD 

7562 cgttacttctcaatgtatggcacaaagagcaatcgagtagctacaccaaacttacaaacaagaaaagcatggaatttcattaaa 

533 RYFSMYGTKSNRVATPNLQTR KAWNFIK 

7646 ttaaaagaaccaaatattgtaggcacaatgagtaacgatgtattaacacgtgtgaaacaaatttttagtgcaggcgttacgctt 

561 LKEPNIVGTMSNDVLTRVKQIFSAGVTL

7730 tggcatacgaatgatgttttgaattataaccaagacaacggagatgtatag 7780 

589 WHTNDVLNYNQDNGDV*
182ORF002

2152 atgattaagaaatatactggcgactttgaaacaacaactgatctcaacgattgtcgtgtatggtcgtggggcgtatgcgatata 

1 MIKKYTGDFETTTDLNDCRVWSWGVCDI 

2236 gacaacgttgacaatatgacgttcggtttagaaatcgattctttttttgagtggtgtaaaatgcaaggcagcacagacatttat 

29 DNVDNMTFGLEIDSFFEWCKMQGSTDIY 

2320 ttccacaacgaaaaatttgacggagagtttatgctttcatggttattcaaaaatggtttcaaatggtgtaaagaagcaaaagaa 

57 FHNEKFDGEFMLSWLFKNGFKWCKEAKE 

2404 gatcgaacattctccacactcataccaaatatgggtcaatggtatgctttggaaatttgttgggaagttaattacacaacaaca 

85 DRTFSTLISNMGQWYALEICWEVNYTTT

2488 aaatcaggtaaaacgaaaaaagagaaatctcgaacaataatttatgatagccttaaaaaatatccttttccagtgaaacaaatt 

113 KSGKTKKEKSRTIIYDSLKKYPFPVKQI

2572 gcagaagcttttaattttcctataaaaaaaggcgaaatagattatacaaaagaaagacctattggttacaaaccaacaaaagat 

141 AEAFNFPIKKGEIDYTKERPIGYKPTKD

2656 gaatgggagtatttaaagaacgacactcagattatggcgatggcattaaaaattcaattcgatcaaggactaactcgaatgact 

169 EWEYLKNDIQIMAMALKIQFDQGLTRMT 

2740 agaggaagcgacgctttaggcgactacaaagattggctaaaagctacacatggaaaatcaactttcaaacaatggtttcctatt 

197 RGSDALGDYKDWLKATHGKSTFKQWFPI

2824 ttgtctttagggtttgataaagacttacgtaaagcatacaaaggcggcttcacttgggtaaacaaagtttttcaagggaaagaa 

225 LSLGFDKDLRKAYKGGFTWVNKVFQGKE

2908 ataggtgacggcattgtctctgacgtcaactctttgtatccctctcaaatgtacgtaagacctttaccatatggaacacctcta 

253 IGDGIVFDVNSLYPSQMYVRPLPYGTPL 

2992 ttctacgaaggagaatacaaaccgaacaacgactatccgctgtacattcaaaatatcaaagtaagattccgtttaaaggagggt 

281 FYEGEYKPNNDYPLYIQNIKVRFRLKEG 

3076 tatattccaaccattcaagttaagcaaagttcattattcattcaaaacgaatatcttgaatcaagtgtaaacaagttaggagtt 
309 YIPTIQVKQSSLFIQNEYLESSVNKLGV

3160 gacgaattaatcgatcttactcttacaaatgttgacctagaattattttttgaacactacgatattttagagatacattacact 

337 DELIDLTLTNVDLELFFEHYDILEIHYT 

3244 tacggatatatgttcaaagcttcttgtgatatgttcaaaggctggatcgataaatggatcgaagtaaagaacaccaccgaaggg 

365 YGYMFKASCDMFKGWIDKWIEVKNTTEG

3328 gctagaaaagctaacgccaaaggtatgttaaatagcttgtatggaaagttcggaacaaaccctgacattacaggaaaagtgcct 

393 ARKANAKGMLNSLYGKFGTNPDITGKVP

3412 tacatgggcgaggacggcattgttcgattgacactaggagaagaagaattaagagatcctgtttatgttccgcttgctagtttt 

421 YMGEDGIVRLTLGEEELRDPVYVPLASF

3496 gtgacggcttggggtagatatactaccattacaaccgctcaaaaatgttttgatcgcattatttatcgtgatacagatagcatt 

449 VTAWGRYTTITTAQKCFDRIIYCDTDSI

3580 catctagtaggaacagaagttccagaagcaatcgatcacttggttgatcctaaaaaacttggttattgggggcatgaaagcaca 

477 HLVGTEVPEAIDHLVDPKKLGYWGHEST 

3664 tttcaacgagcaaaattcattcggcagaaaacatacgtagaagaaattgatggcgaattaaatgtaaagtgtgctggtatgcca 

505 FQRAKFIRQKTYVEEIDGELNVKCAGMP 

3748 gatcgaataaaagagattgtaacttttgacaattttgaagttggtttttcaagctatggaaagctgctacctaaaagaacacaa 

533 DRIKEIVTFDNFEVGFSSYGKLLPKRTQ

3832 ggtggcgtggtattagtagacacaatgtttacaatcaaataa 3873 

561 GGVVLVDTMFTIK* 
182ORF003 

11305 atggaagaacgaattgatattcaaatgaacaagatgaaagaagaaaatcaaaagaattacctattgcaccctgaaacgaacccg 

1 MEERIDIQMNKMKEENQKNYLLHPETNP

11389 aaacaagttgtttttgatgaaacattgcatggaaatgaaaatcaggagagtttcaacaattttgttgacacaagaaaaatgaca 

29 KQVVFDETLHGNENQESFNNFVDTRKMT 

11473 actacaattgatgtaagtgcttatggggttatcgctgacggtgtaacagattgtacaccaatattaaataaattacttgaagaa

57 TTIDVSAYGVIADGVTDCTPILNKLLEE

11557 aaaagcgaaatgggtatcactttttattttcctccttgtgaacgtgattcatattatcgctttgctaacaccattgaattgaaa 

85 KSEMGITFYFPPCERDSYYRFANTIELK

11641 cgtgatgtacctgtagttactttcttaggatcgggagaaacgacattaaagtttgaaacaatgacggcatttaatgtaaacatc 

113 RDVPVVTFLGSGETTLKFETMTAFNVNI 

11725 gaaagtttcaatattgatggttttgcattatggttgccacaaggcgctcaaagtggtaaaggaattttctttaatgatactcgc 

141 ESFNIDGFALWLPQGAQSGKGIFFNDTR

11809 aattacaatcgttttgactttgatttgtttgttcgtaactgtactttaaatgaaggaacgtatgttgttgttgctagaggtaga 

169 NYNRFDFDLFVRNCTLNEGTYVVVARGR

11893 ggggttacatttgaaaattgtctattctctaatatctctcaagcaattatcaaaacagcttttcccgatgtaaatggtatgtgg 

197 GVTFENCLFSNISQAIIKTAFPDVNGMW

11977 caagggaacgatatcaatactaggggtacaggttttagaggtttctttgtgaaaaacaaccgtattcatttttgtacagcgatc 

225 QGNDINTRGTGFRGFFVKNNRIHFCTAI

12061 attatcgacaatgacgatgattatcagaatgtaattaatttctgtgaaatttctggtaacacaatcgaaggtggcgtaagttat 

253 IIDNDDDYQNVINFCEISGNTIEGGVSY

12145 tatcgaggatatgcgcataacttgcatgtccaaaacaacaaccattttctagcatacggaaatagaaacgctttgtttgagttt 

281 YRGYAHNLHVQNNNHFLAYGNRNALFEF

12229 caagatgtggatcaagcttatattgatgtagatgtttattgtcgtaactcacaagtcgagggaatgaatagtacagctatttca 

309 QDVDQAYIDVDVYCRNSQVEGMNSTAIS 

12313 cgtttaattgttgtttacggacattaccgaaacttaaagattacaggtaaattatatcgttgtcaaggacatgttatcacgttg 

337 RLIVVYGHYRNLKITGKLYRCQGHVITL

12397 tatggcggtggcgttaatttctattgtgacttgatggcacaagaagcacctttgacggacggttaccggtttattcaaacggct 

365 YGGGVNFYCDLMAQEAPLTDGYRFIQTA

12481 gacaatcgagttaactatgatgggtttgttgttcgtggtttgtctaattcaacaaaagtaaatacaccaatgatctataaagca 

393 DNRVNYDGFVVRGLSNSTKVNTPMIYKA 

12565 cctcagactgttttctataatcgtagaatcgatcatgtgctaacaggtccaaatgcaagtaatgtatataactag 12639 

421 PQTVFYNRRIDHVLTGPNASNVYN* 
182ORF004

4626 atggctgacaaaatcacagaacaagatgttcttcgtgccacaaatgtagaaacaccagtacaattaatgactgctatttataat 

1 MADKITEQDVLRATNVETPVQLMTAIYN 

4710 agttcatcatctctttttcaggcgaacgtacctatgccaaatgcagataacatcgaagcggttggtgcagggatcacacgttta 

29 SSSSLFQANVPMPNADNIEAVGAGITRL 

4794 gacgtagtaaaaaacgaatttatttcaactttagttgaccgtattggtaaagtagttatccgatacaaatcttggcgtaaccct 

57 DVVKNEFISTLVDRIGKVVIRYKSWRNP

4878 ttgaaaatgtttaaaaaaggaaacatgcctttaggtcgaacgattgaagaaatttttgttgacattgcacaggaacataagttc

85 LKMFKKGNMPLGRTIEEIFVDIAQEHKF

4962 aaccctgacgagtctgttacaggggtatttaaacaggaagttcccgatgtaaaaacattgttccacgaaattaatcgtgaaggt 

113 NPDESVTGVFKQEVPDVKTLFHEINREG 

5046 tactacaaacaaacgatccaagaagcatggttagaaaaagcatttacttcatgggataatttcaatagtttcgttgctggtgta 

141 YYKQTIQEAWLEKAFTSWDNFNSFVAGV 

5130 atgaacgctttatacacaggtgacgaagtaagcgaatttgaatacacgaaattattaatagcaaactaccaagaaaaagagcta 

169 MNALYTGDEVSEFEYTKLLIANYQEKEL

5214 ttcaaagagatcgaaattggcgaaattactgaatcaaatgcaaaagaatttatccgtaagatcaaatcaacctctaacaaatta 

197 FKEIEIGEITESNAKEFIRKIKSTSNKL

5298 gaatttatgagttccgcttacaacgctcaaggagttaaaacatctacctcaaaatctgatcaatacgttat&attgacgccgac 

225 EFMSSAYKAQGVKTSTSKSDQYVIIDAD

5382 acagacgcaaccattgacgttgacgttttagcagcggcattcaatatgagtaaaactgactttgtaggacacaaaatcgttatt 

253 TDATIDVDVLAAAFNMSKTDFVGHKIVI 

5466 gatgagtttcctaaaaaagaaggcgaagaatcgtcaaatattgtggcagttattgtagatagtgaatggtttatgatctacgac 
281 DEFPKKEGEESSNIVAVIVDSEWFMIYD
5550 aaattgtacaaaacaacaagtctatacaaccctgaagggttatattggaattattggttgcaccaccaccaactatattctact 
309 KLYKTTSLYNPEGLYWNYWLHHHQLYST 
5634 tctcaattcgggaacgctgttgcttttgttaaatcagcaacaaaacccgtcacaaaagttgcttttgcaagtgcaacaactagt 
337 SQFGNAVAFVKSATKPVTKVAFASATTS 
5718 gttgttaaaggatcatctaaagatatcgcattgacatttacaccagtagaagcaacaaaccaacaaggagaagttgtttcatca 
365 VVKGSSKDIALTFTPVEATNQQGEVVSS 

5802 gcaccagcattggttaaggcaaccgtaaaacaaacagcaggtaaagcgactgccgtaaccgtagaaggcttagaagtcggtcaa 

393 APALVKATVKQTAGKATAVTVEGLEVGQ 
5886 tcattagtaacattcacagctatcggaggtcaacaagcaacggttcctgttacggttacttctgactaa 5954 

421 SLVTFTAIGGQQATVLVTVTSD* 
182ORF005 

12651 atggcaactcttacaaatgaacaaatagctagaggacaaacaatcgctaaaatactttcaaaatatggctataataaaaattca 

1 MATLTNEQIARGQTIAKILSKYGYNKNS 

12735 caagtaggagctgtcgccaatctccattgggaatcggctggtttgaacccgaacagcaatgaatatggtggaggcggatatggg 

29 QVGVVANLHWESAGLNPNSNEYGGGGYG

12819 ttaggtcaatggacgcctaaaagcaatctttatcgccaagcacaaatttgtgggttgtctaatgctaaagctgaaacgttggaa 

57 LGQWTPKSNLYRQAQICGLSNAKAETLE

12903 ggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatggataatacacctgtttcttctgcaggttatactaac 

85 GQAEIIAQGDKTGQWMDNTPVSSAGYTN

12987 cctcagaccctttcagcatttaaacaatctgcaaatattgatgttgctacaattaattttatgtgtcactgggaacgccctggt 

113 PQTLSAFKQSANIDVATINFMCHWERPG 

13071 aaacttcatatcgaagaaagacttgatcttgcacaagcttatagtaagcatattgacggtagcggtggcggtggcgtaaaacgt 

141 KLHIEERLDLAQAYSKHIDGSGGGGVKR

13155 tgccatggaaccccaatcaagaatacaaatcttgatcctaaaagtttcatgagtggacaactttttggcacgcatgcaggaaac 

169 CYGTPIKNTNLDPKSFMSGQLFGTHAGN

13239 ggcagaccaaataatttccatgatggttcggactttggttcaattgatcaccctggcaatgaaatgattgcatgttgcgatgga 

197 GRPNMFHDGLDFGSIDHPGNEMIACCDG 

13323 acagtaacacatgttggaacaatgggagcattaagagcgtattttgtgataaatgatggtacttacaatatcgcttatcaagaa 

225 TVTHVGTMGALRAYFVINDGTYNIVYQE 

13407 ttcagttataaccagtcaaatataaaggtaaaagttggcgacaaagttaagaacggacaagtttgcgcaatacgtgacgcggat 

253 FSYNQSNIKVKVGDKVKNGQVCAIRDAD 

13491 catttacatttaggttttactaaaaaagattttatgactgcgttaggatcttctttcatagacgatggaacatgggaagaccct

281 HLHLGFTKKDFMTALGSSFIDDGTWEDP 

13575 ttgaagtttccagggcaatgttttggagatggagacactggcggagataatgacgataacaataaggataaaaatgatcttatt 

309 LKFLGQCFGDGDTGGDNDDNNKDKNDLI 

13659 tatctattgctatccgatgccttgaatggttggaaatttcaa 13700 

337 YLLLSDALNGWKF* 
182ORF006 

14995 acgacaaatagcttaggcgttaaacttgaagagaaaaacttatactataaccctaacaatgctttaggttttaattgcctaatg 

1 MTNSLGVKLEEKNLYYNPNNALGFNCLM

15079 ttgtttgtaataggcgcacgtggtataggtaaaacctatggttataaaaaatttgttgttaatcgctttattaaacacggcgaa

29 LFVIGARGIGKTYGYKKFVVNRFIKHGE

15163 caatttatttatttaagaagattcaaaacagaacttaaaaagattcctcaatttttcaaaacaatggcgaaagaatttcctgat 

57 QFIYLRRFKTELKKIPQFFKTMAKEFPD

15247 cataaacttgaagtaaaaggaaaagaattctattgtgatgataaattaatgggttgggctgttccacttagtacgtggggaatt 

85 HKLEVKGKEFYCDDKLMGWAVPLSTWGI

15331 gaaaaatctaatgaatatcccgaagttcgtacaattttgtttgatgagtttttaattgagaaatcaaaaatcacttatttacca

113 EKSNEYPEVRTILFDEFLIEKSKITYLP 

15415 aacgaagctgaagccttattgaacatgatggaaacggttttccgaagacgtacaaatacaagatgtgttatgttgagtaatgca 

141 NEAEALLNMMETVFRRRTNTRCVMLSNA 

15499 actagtgtagtgaacccttatttcttgtatttcaatctgcagccagatttgaacaagcgttttaatctatatcaagatcgaggt 

169 TSVVNPYFLYFNLQPDLNKRFNLYQDRG

15583 atattgattgaattgtgtgattcaaaagactttgcagaagtgaagagagaaacaccttttggtagattgactcgtggaacagaa 

197 ILIELCDSKDFAEVKRETPFGRLIRGTE 

15667 tacgaagattttagtatcaacaatgagtttgtcaatgatagtgatacgtctattgaaaagagaagcaaaaatagtagtttctta 

225 YEDFSINNEFVNDSDTFIEKRSKNSSFL 

15751 tgcgccattgcttttgaagggaaaatctttgggtattggatagacgctgaaacaggttgtgtctatgtgagttatgattatcaa 

253 CAIAFEGKIFGYWIDAETGCVYVSYDYQ

15835 ccaaatacaaatcatttttatgcaatgactacgaaagaccatgaagaaaatagattgctgatgaaaaatcggcgaaataattat 

281 PNTNHFYAMTTKDHEENRLLMKNWRNNY 

15919 tatctttcaacagtggcgaaagcattcaagaatagttatctgcggtttgataacattgttatcaagaatttacactatgattcg

309 YLSTVAKAFKNSYLRFDNIVIKNLHYDL

16003 tttaataagatgaaaatctggtaa 16026 

337 FNKMKIW* 



182ORF007 

7795 atgagtagacgaaaaggtgcaggacttgctagaaataaccgttatacagcaaaaagcagaccctatccaaatgaaccctattca 

1 MSRRKGAGLARNNRYTAKSRPYPNEPYS

7879 agtgatgtagaagaaatcagctactatgaacattatcgtagacaactcacgctccttacgttccagttgttcgaatgggaaaat 

29 SDVEEISYYEHYRRQLTLLTFQLFEWEN 

7963 ttgccaaaaccaattgaccctcgttatttagaaattgctttacacactaacggttatcttggtttctttaaagaccctacacct 

57 LPKSIDPRYLEIALHTNGYLGFFKDPTL

8047 gggtccatggtttgcgcaggggcagaagacggtcaaatcgaccatcaccacaaccctattttctttacagcaaacgaagcaatg
85 GFMVCAGAEDGQIDHYHNPIFFTANEAM

8131 tatcacaagagatatcctgttttaagatatgatgatgatgatgataaatcaaaatgtatcacgctgtataataatgacttgaaa 

113 YHKRYPVLRYDDDDDKSKCIMLYNNDLK

8215 gttcctacgttaccaagtttacatcgttttgctttagatatggcggacataaaccagatatcacgagtgaatcgaagagcgcaa 

141 VPTLPSLHRFALDMADINQISRVNRRAQ

8299 aaaacacctgtaattattcaaactgatgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcag 

169 KTPVIIQTDEKKYFSLLQAYNQIDENNQ 

8383 gctgtttttgtggataaagatatggagtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtagtagataaacta 

197 AVFVDKDMEFDESFNVWQTNAPYVVDKL 

8467 cgatcagaattgaacgaagtatggaatgaagtgttaacttttctaggtatcaacaatgctaacgtagataagactgcacgtgta 

225 RSELNEVWNEVLTFLGINNANVDKTARV

8551 caaacatcagaagtcttatctaacaatgaacagattgaaagtccaggtaacatcttgttaaaatcaagaaaagagttttgcgat 

253 QTSEVLSNNEQIESSGNILLKSRKEFCD 

8635 cgtgtaaatcgtgtctttggcgatgaacttgacggaaagattgacgtgaagtttagaacagacgccgttcgacaattacaactg 

281 RVNRVFGDELDGKIDVKFRTDAVRQLQL 

8719 gcggcaggtcaatcaaaaaaagaccagatgagtggagggtcgccaagtgctacttaa 8775

309 AAGQSKKDQMSGGLPSAT* 
182ORF008 

14105 atgatgaatggtattgatatctctagttatcaaacaggaattgatctttcaaaagttccatgcgattttgtaaatattaaagca 

1 MMNGIDISSYQTGIDLSKVPCDFVNIKA 

14189 acaggcggaacaggttatgtaaaccctgattgtgaccgagcatttcaacaagctttgtctttaggtaaaaagattggtgtgtat 

29 TGGTGYVNPDCDRAFQQALSLGKKIGVY 

14273 cattttgcgcatgagaggggtttagaaggtacacctcaacaagaagcgcaattctttttagataatattaagggttacattggt 

57 HFAHERGLEGTPQQEAQFFLDNIKGYIG 

14357 aaagctgttcttattcttgactttgaagggtcaaatcagaaagatgtaaattgggcgaaagcatttcttgattatgtttataat 

85 KAVLILDFEGSNQKDVNWAKAFLDYVYN

14441 aaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaaaaggcgattat 

113 KTGVKAWFYTYTANLNTTDFSSIAKGDY 

14525 ggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacccaaaacaaataattttccaatt 

141 GLWVAEYGSNQPQGYSQPAPPKTNNFPI 

14609 gttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttgaatgttttctatggcgatggt 

169 VACFQFTSKGRLPGYNGNLDLNVFYGDG 

14693 aatacatgggatctgtatgtaggtaaaaaacaggaccaaattgttcctcctgaaaataaaatacttgacgccacaagtgatgag 

197 NTWDLYVGKKQDQIVPPENKIFDATSDE

14777 tttattttcactcttacaacaggtagcacaagcgtgttttatttcgacggagaaacgatctttgaattgtctgatccaacacaa 

225 FIFTLTTGSTSVFYFDGETIFELSDPTQ

14861 ctcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaatttgatatt 

253 LDHIRGTYNHVHGKEIPSMVWTPEQFDI

14945 tacttaaaaatgtatgaaaagaaaccagtatataaatag 14983 

281 YLKMYEKKPVYK* 
182ORF009

8765 gtgctacttaaacgttatattgaaagtttcacttattaccaacctgaattatctcgaaaagaacgtattgaagttggccgaaaa 

1 VLLKRYIESFTYYQPELSRKERIEVGRK

8849 caattgtttgattttgattatccgttttatgacgaaacaaaacgagcagaatttgaaacaaaacttatcaatcacttttacttg 

29 QLFDFDYPFYDETKRAEFETKFINHFYL 

8933 agagagataggctcagaaacgatgggatcatttaagtttaatcttgacgaatatttaaatctaaacatgccctattggaataaa 

57 REIGSETMGSFKFNLDEYLNLNMPYWNK

9017 atgcccctatcaaatcttgaagagtttccgatttttgatgacacggactacaccattgatgagaaacagaaattgttaaatgag 

85 MFLSNLEEFPIFDDMDYTIDEKQKLLNE 

9101 attgatacaaacatcaaagcgaatcgtgatgaatcgaagaaccaaacgaagcaagtagatcaaacagacaacagaaacaaaaat 

113 IDTNIKANRDESKNQTKQVDQTDNRNKN

9185 acacgtgacacaggaacaaccgattctttctcaaggaacacttatacagacacccctcaaaaagatttgagaattgccagcaat 

141 TRDTGTTDSFSRNTYTDTPQKDLRIASN 

9269 ggagatggaacaggtgtaatcaattatgcaacaaatatcacagaagatttgagtaaagaaacaacaagctccacaggcgttgaa 

169 GDGTGVINYATNITEDLSKETTSSTGVE

9353 acaaacaacgacaaaacaaatcaaaatacacgaagcaatgcttctgaaaaagaaacaaagaacacagacattaataaagatcaa 

197 TNNDKTNQNTRSNASEKETKNTDINKDQ 

9437 aatcaaaccaaagatacgattacacgatataaaggtaaaaagggaaacactgattatgctgacttactcgaaaaatatcgtaga 

225 NQTKDTITRYKGKKGNTDYADLLEKYRR 

9521 agtgttttgagaattgagaaaatgatctttagagaaatgaacaaggaaggctcatttctccttgtttatggagggaggtag 9601

253 SVLRIEKMIFREMNKEGLFLLVYGGR*



182ORF010 

1310 ttgaccgtaagaataccaaagaatgatagagccaagttagagaaaacctacggtaaatctaacaaagctcgtaaaaaatacaat 

1 LTVRISKNDRAKLEKIYGKSNKARKKYN 

1394 cgtttaagacaaaaaggagttgaggaaaggcaacttccaactgtcccaacatcaaagaaaagacttattgactacgtaaaatca 

29 RLRQKGVEERQLPTVPTSKKRLIDYVKS 

1478 acaaatatgagtcgtagtgattttaacaagatgttagacgagttggtagattttgcacaacctcacaacgagaattacattttc

57 TNMSRSDFNKMLDELVDFAQPYNENYIF

1562 gagatcaacaagcgaaatgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaa 

85 EINKRNVAISRAQIKEAQIKTEQAQKAK

1646 gaagaacactacaaagagcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaacagag 

113 EEHYKELNKVEVKKPTENTIVTPTILTE 

1730 ttaggtgctgacttaccttttcaagcaataccagattttaatatcgacgctttcacttctccagaaggagttcagtcttattta 
141 LGADLPFQAIPDFNIDAFTSPEGVQSYL

1814 gaaaatataggaaaacaagacgaacaatattttgacgaaagagaccaactttattacgacaatttcagacaagcgatgtttact 

169 ENIGKQDEQYFDERDQLYYDNFRQAMFT 

1898 attttcaattcagacgctgacgatattgttcgtttacttgactcaatggggcttgatctatttatgaaaacatatgttagtaac 

197 IFNSDADDIVRLLDSMGLDLFMKTYVSN 

1982 ttcttagacatgaaccttgactacatttacgacgaagcagaagtacaacagaaaaaagaacaagtttacagtaagattgcaaaa

225 FLDMNLDYIYDEAEVQQKKEQVYSKIAK

2066 gtgatcgagtctgaaacaggtggagaagtcccctcatataaccccacgaagaacatcacaattaattcagaaacaggagaagaa 

253 VIESETGGEVPSYNPTKNITINSETGEE

2150 ttatga 2155 

281 L*
182ORF011

9607 atggtagattttaaccccgacaagcggtttgacggtttacccgctgtattcaaagaacgctttagcaaatatcctcatactgaa 

1 MVDFNPDKRFDGLPAVFKERFSKYPHTE 

9691 tacagatatgaattactattagatgaagaagtatcggctttaattgcctatctgaatgaagttggtgctttagttaatgatatg 

29 YRYELLLDEEVSALIAYLNEVGALVNDM 

9775 agtggttatttaaattactttatcgaacattttgttgagaagttagaagagatcacaaatgacacactcaaaaaatggttgtct 

57 SGYLNYFIEHFVEKLEEITNDTLKKWLS

9859 gatggtacgttagaaaatttaatcaatgatactgtttctgcaaattatatcaaagaaatcaaaagattacaaatcttggttgct 

85 DGTLENLINDTVFANYIKEIKRLQILVA 

9943 gaaacacgtgctaacagtgtgaatattcttttgacaaaaaataaaccggatgttgctgatgatcgaacattttggtataagatt 

113 ETRANSVNILLTKNKPDVADDRTFWYKI 

10027 caacgcgacaatactgattatggagccgatcctattgacacgttacgtattgttgcaatcaataaagttagtggctggaatacc 

141 QRDNTDYGADPIDTLRIVAINKVSGWNT 

10111 gctacaggagatatttatcttaacattaaaggaacggagggtgtataa 10158 

169 ATGDIYLNIKGTEGV* 
182ORF012 

10872 atggcaaataaaaatattcaaatgaaggatagcaatgacaataatttatatccaagtgttcgagcagaaaacttgttagatttg 

1 MANKNIQMKDSNDNNLYPSVRAENLLDL 

10956 accagtcgtgccgaattaacaatgacaaattgtcaattatatgcagctggtgataaaacaaatgcaatctcttatctcggtgca 

29 TSRAELTMTNCQLYAAGDKTNAISYLGA

11040 gtaggtatgctcgaaggtatgataaagtttactgaaagtttgacaaaccctgtgatcacaacgctaccagaaggttttagacca 

57 VGMLEGMIKFTESLTNPVITTLPEGFRP

11124 ataagaacaaaacgtattggttgtttcgcaaaatattacacaccaaatccaacagatacaaaagaaatggtttatgtatcaatc 

85 IRTKRIGCFAKYYTPNPTDTKEMVYVSI 

11208 acacctgatggcaaagtaactgtaaatgacaatgtaggtaaaatcgaatatctatccctagataattgcgttttccctctaaaa 

113 TPDGKVTVNDNVGKIEYLSLDNCVFPLK 

11292 taa 11294 

141 * 
182ORF013 

10456 atggcagataaaaatattcaaatgcaggataaagatcataatcgtttaatgcctgttacaattgctaaaaatgttctaacaggc

1 MADKNIQMQDKDHNRLMPVTIAKNVLTG 

10540 gactctaatcttgaattagttaatgccgaaataagaggtaacgctagtgaagctaaaacacttgcacaacaagctaaagaaact

29 DSNLELVNAEIRGNASEAKTLAQQAKET 

10624 gctgctggtttgtcaacagaaattgacacagtaacatcaaccgcaaatcaagcgttgacgaaggctggtacagcacaacaaacc 

57 AAGLSTEIDTVTSTANQALTKAGTAQQT 

10708 gcagaacaagcgaaaacaacagcaaacagtatcagcgcagttgcaacggcagctaaaaacacagctgattcagcacaaaaaagt

85 AEQAKTTANSISAVATAAKNTADSAQKS

10792 gcaactgatctagctgttcgagtaagcagtttagaggacacagcaatacaatatactgtattaccatag 10860 

113 ATDLAVRVSSLEDTAIQYTVLP*
182ORF014 

13716 atgatagaatatatcacacaatggttggcagatgataatcatcttgtttatggtttgattatacggttaatggttgcaatgatt 

1 MIEYITQWLADDNHLVYGLIIWLMVAMI

13800 atcgattttgtgttaggttttacaattgccaaatttaacaaggaaatcgactttagtagttttaaagctaaagcaggtatcatt 

29 IDFVLGFTIAKFNKEIDFSSFKAKAGII
13884 gttaaggtggcagaaatggttttagtggtttactttattcctgtagcagtaaaactcggtgcagtaggtattacaatgtatata

57 VKVAEMVLVVYFIPVAVKFGAVGITMYI

13968 acaatgttggttggtttgattttatcagaaatttatagtatactaggacatatttcagatatcgatgatgataataattggact 

85 TMLVGLILSEIYSILGHISDIDDDNNWT

14052 gattatgttaagaagtttttagacggaacactcaacagaaaggacgatattaaatga 14108 

113 DYVKKFLDGTLNRKDDIK*
182ORF015 

854 atggaaatcgtaaaaagcacatttgacacacaaacaccagaaggaatgttacaagtattcaacgccacaaacggggcttcaatt 

1 MEIVKSTFDTQTPEGMLQVFNATNGASI 

938 ccgttacgtaacgcaattggcgaagtactagaattgaaagatatcctagtttactcagacgaagtttctggttttggtggagcc 

29 PLRKAIGEVLELKDILVYSDEVSGFGGA 

1022 gaaccatcacaagcagaactagtcgctttcttcacagaagatggtaaaacttatgcgggtgcatcagcagtagcaacaaaatca 

57 EPSQAELVAFFTEDGKTYAGVSAVATKS 

1106 gctaaaaacctaattgatatgatgactgccaaccctgacatcaaaccaaaaatttcttttgtcgaaggaaaatcaaacggtgga 

85 AKNLIDMMTANPDIKPKISFVEGKSNGG

1190 caaaaatttgtaaatctacaagtggtctcactgtag 1225

113 QKFVNLQVVSL* 
182ORF016

17033 atgattaacaatttatcattaattttagagggtttaaaccaactaactaaagatgacaacgatagtttagcgtctatcaagtca 

1 MINNLSLILEGLNQLTKDDNDSLASIKS

16949 gaaataacacaaggaggaaaacaattaattttatacattgattacgttacaaaagagttcgtgttaacacatgataaatataac 
29 EITQGGKQLILYIDYVTKEFVLTHDKYN

16865 tatgtttatcttgatagccattgcattaatatcgcaataacgaaatcaatgaaaagcgttgaacactatgcggaacaattgaaa 

57 YVYLDSHCINIAITKSMKSVEHYAEQLK 

16781 catgacggatataaacaaattacggacaaacag 16749 

85 HDGYKQITDK* 
182ORF017

154 atgaaatattcactacaacaaatagacgaaattaaatcaacaattttcagaattagattaaaaaggcatgaactagaggaattg

1 MKYSLQQIDEIKSTIFRIRLKRHELEEL 

238 gtggacgaagtaaacgatattgctaaagatccggaggaaagatatcttttatcgttttattacacagaagaagaacgtttgttt 

29 VDEVNDIAKDPEERYLLSFYYTEEERLF 

322 gaaattccctctgcaagattaatagattattacaacgaaaagatcacaaatctgaaatcggaaatcatatcactcgaaaaaaga 

57 EIPSARLIDYYNEKITNLKSEIISLEKR 

406 ttacaaaaactagtaaaataa 426 

85 LQKLVK* 
182ORF018 

16737 atgattgcacgaacattcaaagaacaccgcgaactaactgaatggttacgtttctactgtaaacgtaacctttcagacaatgaa 

1 MIARTFKEHRELIEWLRFYCKRNLSDNE

16653 aaaatagagatcatagaggggactttacaagatttcgacgttccggaaataaatatcaccgaacttttgttaactcattcaacg 

29 KIEIIEGTLQDFDVPEINITELLLTHST 

16569 ctattacccgaatcgagtcaatttaacattcttgaaaagtattgtcaggcaatgaaattagtaacttcatacgtaaaagttggt 

57 LLPESSQFNILEKYCQAMKLVTSYVKVG 

16485 tctcgctatcagttagcgttacaaataccaaaaggctatttaaaggaggtggaataa 16429 

85 SRYQLALQIPKGYLKEVE*
182ORF019 

4323 atggaaattaaagaacatgaatcaattttaaatggtattcttgaaagtgtcacagacggtgaagcaagatcaaagattgtagaa 

1 MEIKEHESILNGILESVTDGEARSKIVE

4407 catcttgaagcattgcgagaagactacggagcaacaactgaagctttgacatcagcaaatagcacacttgaaaagttaaagaaa 

29 HLEALREDYGATTEALTSANSTLEKLKK

4491 gataacgaagcgttggttatttcaaactcaaaattgttccgagaacgagcgatcgtagaaccagcagaaaataacgaaccagaa 

57 DNEALVISNSKLFRERAIVEPAENNEPE

4575 acagaccagaatattacactagacgatttaggaatttaa 4613 

85 TDQNITLDDLGI* 
182ORF020 

10158 atggcagacattagaacacaactaacaagtgaagatggatcagacaatttatttccaatttcaaaagccgttaatattatgact 

1 MADIRTQLTSEDGSDNLFPISKAVNIMT

10242 aatagcggtacgaatgtagaaggagaattgggtacactcaaacaaaatgacgaaacaatgaatacctcagttcaaaatgctgta 

29 NSGTNVEGELGTLKQNDETMNTSVQNAV 

10326 gttactgccaatcaagcaaaagattctgtagctgaattaaatgtaaatgttggtaaactaaccaatcgaataacaacattagag 

57 VTANQAKDSVAELNVNVGKLTNRITTLE 

10410 agtacagtggctaatcttgatggtattcgttatgtagaggtgtaa 10454 

85 STVANLDGIRYVEV* 
182ORF021 

17339 atgaacaataaatcattaatagctgaaaaaggagaggtatctctacttcacccctttaatgagtgggatatgaattatcatatc 

1 MNNKSLIAEKGSVSLLHPFNEWDMNYHI 

17255 atagataccgaaaacaataaacattatcttattgatattgatgaggtaggcgatgaggaatattgtttgttatcttttgaagaa 

29 IDTENNKHYLIDIDEVGDEEYCLLSFEE 

17171 ctaaaggaattagatatggatcttatttccgagtattcatggaaaactacagaaataacatattaa 17106 

57 LKELDMDLISEYSWKTTEITY* 
182ORF022

12868 gtgggttgtctaatgctaaagctgaaacgttggaaggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatgg 

1 VGCLMLKLKRWKVKQRSSLKGIKQVNGW

12952 ataatacacctgtttcttctgcaggttatactaaccctcagaccctttcagcatttaaacaatctgcaaatattgatgttgcta 

29 IIHLFLLQVILTLRPFQHLNNLQILMLL

13036 caattaattttatgtgtcactgggaacgccctggtaaacttcatatcgaagaaagacttgatcttgcacaagcttatagtaagc 

57 QLILCVTGNALVNFISKKDLILHKLIVS

13120 atattgacggtagcggtggcggtggcgtaa 13149 

85 ILTVAVAVA* 
182ORF023 

12189 atggttgttgttttggacatgcaagttatgcgcatatcctcgataataacttacgccaccttcgattgtgttaccagaaatttc 

1 MVVVLDMQVMRISSIITYATFDCVTRNF

12105 acagaaattaattacattctgataatcatcgtcattgtcgataatgatcgctgtacaaaaatgaatacggttgtttttcacaaa 

29 TEINYILIIIVIVDNDRCTKMNTVVFHK

12021 gaaacctctaaaacctgtacccctagtactgataccgttcccttgccacataccatttacatcgggaaaagctgttttgataat 

57 ETSKTCTPSIDIVPLPHTIYIGKSCFDN 

11937 tgcttgagagatattagagaatag 11914 

85 CLRDIRE* 
182ORF024 

6174 atgcttgtaactatctcatctttaaaaacgaagaaacttatcctagtaaatggcagtatgcctttgttactgatattgaatata 

1 MLVTISSLKTKKLILVNGSMPLLLILNY

6258 agaatgacaacacaagtttcgttacctttgaaattgatgttttacaaacttatcgtttcgatattggtatacgagaaagtttca 

29 RMTTQVSLPLKLMFYKLIVSILVYEKVS

6342 ttgcaaaagaacaccctcaactttattattcgaatggaatacctttcattaatacaattgaagagtcgcttgattacggtagag 

57 LQKNTLNFIIRMEYLSLIQLKSRLITVE

6426 aatacacaacaacaaatgtaa 6446 

85 NTQQQM* 
182ORF025



548 atgggtcgaaaactaatgcaacgaaacgtaacatcaactaaagtagaattctcagaagttatcgtacaagatggagcgccaaca

1 MGRKLMQRNVTSTKVEFSEVIVQDGAPT 

632 attgcaccatgcgaaccagttgtcttaacaggaaaactttcagaagaaaaagctttatcagcgatcaaacgtaaaaaccccgat 
29 IVPCEPVVLTGKLSEEKALSAIKRKNPD 

716 aaaaacgtagttgtaacaaatgtttcacatgaaacagcgctttacacaatgccagtcgataaatttatcgagttagcagacaaa 

57 KNVVVTNVSHETALYTMPVDKFIELADK 

800 tcaacacaagcctaa 814 

85 STQA*
182ORF026 

13259 atggaaattatttggtctgccgtctcctgcatgcgtgccaaaaagttgtccactcatgaaacttttaggatcaagatttgtatt 

1 MEIIWSAVSCMRAKKLSTHETFRIKICI 

13175 cttgattggggttccatagcaacgttttacgccaccgccaccgctaccgtcaatatgcttactataagcttgtgcaagatcaag 

29 LDWGSIATFYATATATVNMLTISLCKIK

13091 tctttcttcgatatgaagtttaccagggcgttcccagtgacacataaaattaattgtagcaacatcaatatttgcagattgttt 

57 SFFDMKFTRAFPVTHKINCSNINICRLF

13007 aaatgctga 12999 

85 KC*
182ORF027

14896 atgaacatgattgtatgttcctctaatatgatcgagctgtgttggatcagacaattcaaagatcgtttctccgtcaaaataaaa

1 MNMIVCSSNMIELCWIRQFKDRFSVKIK 

14812 cacgcttgtgctacctgttgtaagagtgaaaataaactcatcacttgtggcgtcaaatatttcattttcaggaggaacaatttg 

29 HACATCCKSENKLITCGVKYFIFRRNNL

14728 atcctgttttttacccacatacagaccccatgtattaccatcgccatagaaaacattcaaatcaagattgccgttgtatcctgg 

57 ILFFTYIQIPCITIAIENIQIKIAVVSW 

14644 taa 14642

85 * 
182ORF028

14430 atgtttataataaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaa 

1 MFIIKQALKHGFIRIQQTSIQLIFLVLQ 

14514 aaggcgattatggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaata 

29 KAIMVYGLLNMDQINHKATLNQRHLKQI 

14598 attttccaattgttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttga 14672 

57 IFQLLPVFSLQVKDVYQDTTAILI*
182ORF029

17606 atgaatgaaccgatcgtatacacagaaatttattcaaataacgtggtatgtatgaaaatttttagagatgaggataaacttagt 

1 MNEPIVYTEIYSNNVVCMKIFRDEDKLS

17522 aaattcctctatttagaatttgaggtggatgaggctaaaaagttacttgaaaataaaacaatttcatttgatgataactggact 

29 KFLYLEFEVDEAKKLLENKTISFDDNWT

17438 ttctcaataaattatccagaatattaa 17412 

57 FSINYPEY* 
182ORF030 

16429 atggctacattctacaaggaaccaatatacgatatcacagtattttatatagatggttgggaggttttgatacacaaaaccgaa 

1 MATFYKEPIYDITVFYIDGWEVLIHKTE 

16345 cctctcaccttaacaaaagcattaaaatatagccgtatatacctagaaatggatatagtgaactgcgttagaatagaaagaaat 

29 PLTLTKALKYSRIYLEMDIVNCVRIERN

16261 ggacgtcctatagctacattttacagggaattattaaaactgtataaggagaaagaactatga 16199 

57 GRPIATFYRELLKLYKEKEL*

182ORF031 

8603 atgttacctgaactttcaatctgcccattgttagataagacttctgatgtttgtacacgtgcagtcttatctacgttagcattg 

1 MLPELSICSLLDKTSDVCTRAVLSTLAL 

8519 ttgatacctagaaaagttaacactccactccatacttcgctcaattccgatcgtagtttatctactacatatggagcatttgtt 

29 LIPRKVNTSFHTSFNSDRSLSTTYGAFV 

8435 tgccatacattaaaagattcgtcaaactccatatctttatccacaaaaacagcctga 8379

57 CHTLKDSSNSISLSTKTA* 
182ORF032 

11413 atgtttcatcaaaaacaacttgttccgggttcgtttcagggtgcaataggtaattcttttgattttcttctttcatcttgttca 

1 MFHQKQLVSGSFQGAIGNSFDFLLSSCS 

11329 tttgaatatcaattcgttcttccatatgaacctccttattttagagggaaaacgcaattatctagggatagatattcgatttta 

29 FEYQFVLPYEPPYFRGKTQLSRDRYSIL 

11245 cctacattgtcatttacagttactttgccatcaggtgtgattgatacataa 11195 

57 PTLSFTVTLPSGVIDT* 
182ORF033 

4942 atgtcaacaaaaatttcttcaatcgttcgacctaaaggcatgtttccttttttaaacattttcaaagggttacgccaagatttg

1 MSTKISSIVRPKGMFPFLKI FKGLRQDL 

4858 tatcggataactactttaccaatacggtcaactaaagttgaaataaattcgttttttactacgtctaaacgtgtgatccctgca

29 YRITTLPIRSTKVEINSFFTTSKRVIPA

4774 ccaaccgcttcgatgttatctgcatttggcataggtacgttcgcctga 4727

57 PTASMLSAFGIGTFA*
182ORF034

6160 gtgtttatctactctaaaaactcccccgagttgtgtatccctttgataagaacaatctctattctcgttaagaacaggaaacga 

1 VFIYSKNSPELCIPLIRTISILVKNRKR

6076 attaaagtacgattcctgttcctgttgagttttaaaccatcttgtgtgtgtataggtgttatcaaaaggcacgttagccaacaa 

29 IKVRFLFLLSFKPSCVCIGVIKRHVSQQ

5992 ttttacatttgtacaccttcttgccataattgtcctccttag 5951
57 FYICIPSCHNCPP*
182ORF035 

15758 atggcgcataagaaactactatttttacctctcttttcaataaacgtatcactatcattgacaaactcattgctgatactaaaa 

1 MAHKKLLFLLLFSINVSLSLTNSLLILK

15674 ccttcgtatcccgttccacgaatcaatccaccaaaaggtgcttctctcttcacttctgcaaagtcctttgaatcacacaattca 

29 SSYSVPRINLPKGVSLFTSAKSFESHNS 

15590 atcaatatacctcgatcttga 15570 

57 INIPRS* 
182ORF036 

2315 atgtctgtgctgccttgcattttacaccactcaaaaaaagaatcgatttctaaaccgaacgtcatattgtcaacgttgtctata 

1 MSVLPCILHHSKKESISKPNVILSTLSI 

2231 tcgcatacgccccacgaccatacacgacaatcgttgagatcagttgttgtttcaaagtcgccagtatatttcttaaccataatt 

29 SHTPHDHTRQSLRSVVVSKSPVYFLIII 

2147 cttctcctgtttctgaattaa 2127 

57 LLLFLN* 
182ORF037 

12280 gtgagctacgacaataaacatctacatcaatataagcttgatccacatcttgaaactcaaacaaagcgtttctacttccgtatg 

1 VSYDNKHLHQYKLDPHLETQTKRFYFRM

12196 ctagaaaatggttgttgttttggacatgcaagtcatgcgcatatcctcgataataacttacgccaccttcgattgtgttaccag 

29 LENGCCFGHASYAHILDNNLRHLRLCYQ 

12112 aaatttcacagaaattaa 12095 

57 KFHRN*
182ORF038

14769 gtgatgagtttattttcactcttacaacaggtagcacaagcgtgttttattttgacggagaaacgatctttgaattgtctgatc 

1 VMSLFSLLQQVAQACFILTEKRSLNCLI

14853 caacacaactcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaat 

29 QHNSIILEEHTIMFMEKKSHQWCGHLNN 

14937 ttgatatttacttaa 14951 

57 LIFT* 
182ORF039 

9992 atgttgctgatgatcgaacattttggtataagattcaacgcgacaatactgactatggagccgatcctattgacacgttacgta 

1 MLLMIEHFGIRFNATILIMEPILLTRYV 

10076 ttgttgcaatcaataaagttagtggctggaataccgctacaggagatatttatcttaacattaaaggaacggagggtgtataat 

29 LLQSIKLVAGIPLQEIFILTLKERRVYN 

10160 ggcagacactag 10171 

57 GRH*
182ORF040 

16202 atgagaaaagatttcgtctacattaacacacccgatccaaaagcaaacaaaaaggcgttagcaaaaatcactaacgccaaagaa 

1 MRKDFVYINTPDPKANKKALAKITNAKE 

16118 ccaaaacaaaactatcgcagactacaattactatgttatctactattcatcattgtaatagaactaatcgtggtagctctacta 

29 PKQNYRRLQLLCYLLFIIVIELIVVALL

16034 aaatag 16029 

57 K*

182ORF041

3886 atggaactatataaagcaatgtttatcgtacgtgatgaaggtactattgacggttacgatactgaacactatgtagatatttct 
1 MELYKAMFIVRDEGTIDGYDTEHYVDIS 
3970 ttacatgactttgaagaaatatatggaaaagaaacacgtgaaattgaagcagtaacattagtaaaaacaggaaatttaaaaaaa 
29 LHDFEEIYGKETREIEAVTLVKTGNLKK

4054 taa 4056 

57 * 
182ORF042 

10832 gtgtcctctaaaccgcttactcgaacagctagatcagttgcacttttttgtgctgaatcagctgtgtttttagctgccgttgca 
1 VSSKLLTRTARSVALFCAESAVFLAAVA 
10748 actgcgctgatactgtttgctgttgttttcgcttgttctgcggtttgttgtgctgtaccagccttcgtcaacgcttga 10671 
29 TALILFAVVFACSAVCCAVPAFVNA*

182ORF043

10652 gtgtcaatttctgttgacaaaccagcagcagtttctttagcttgttgtgcaagtgttttagctccactagcgctacctcttatt 
1 VSISVDKPAAVSLACCASVLASLALPLI 
10568 tcagcattaactaattcaagattagagtcgcctgttagaacatttttagcaattgtaacaggcattaaacgattatga 10491 
29 SALTNSRLESPVRTFLAIVTGIKRL* 
182ORF044 

6457 atgaaaagttgttacatttgttgttgtgtattctctaccgtaatcaagcgactctccaattgtattaatgaaaggtattccatt 
1 MKSCYICCCVFSTVIKRLFNCINERYSI 
6373 cgaataataaagttgagggtgttcttttgcaatgaaacttcctcgtataccaatatcgaaacgataagtctgtaa 6299 

29 RIIKLRVFFCNETFSYTNIETISL*
182ORF045

6729 atgaatggtatacctgtatacgacgttacatacaccccgactatcttatctaaaaaaggttctttcgttgtaagaaacgccatg 
1 MNGIPVYDVTYIPTILFKKGSFVVRNAM

6645 cactctccaaaattagcattgcccgccccatttggtttgtatacctccccacttgaattgacaggaagtaaataa 6571 

29 YSPKLALPAPFGLYTSPLELIGSK* 
182ORF046 

2372 atggtttcaaatggtgtaaagaagcaaaagaagatcgaacattctccacactcatatcaaacatgggtcaatggtatgctttgg 
1 MVSNGVKKQKKIEHSPHSYQIWVNGMLW 
2456 aaatttgttgggaagttaattacacaacaacaaaatcaggtaaaacgaaaaaagagaaatctcgaacaataa 2527 

29 KFVGKLITQQQNQVKRKKRNLEQ* 
182ORF047

13353 atgctcccattgttccaacatgtgttactgttccatcgcaacatgcaatcatttcattgccagggtgatcaattgaaccaaagt

1 MLPLFQHVLLFHRNMQSFHCQGDQLNQS 

13269 ccaaaccatcatggaaattatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatga 13201 

29 PNHHGNYLVCRFLHACQKVVHS* 
182ORF048 

3395 atgtcagggtttgttccgaactttccatacaagctattcaacacacctccggcgttagcttttctagccccttcggtggtgttc 

1 MSGFVPNFPYKLFNIPLALAFLAPSVVF

3311 tttacttcgatccatttatcgatccagcctttgaacatatcacaagaagctttgaacatatatccgtaa 3243 

29 FTSIHLSIQPLNISQEALNIYP* 
182ORF049 

1578 atgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaagaagaacactacaaag 

1 MLQSQERKSKKRKLKQSKLKKRKKNTTK

1662 agcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaa 1724 

29 SLTKLKLRSPQKTQLSHQLF* 
182ORF050 

8012 atggttatcttggtttctttaaagaccctacacttgggttcatggtttgcgcaggggcagaagatggtcaaatcgatcattatc 

1 MVILVSLKTLHLGSWFAQGQKMVKSIII 

8096 acaaccctattttctttacagcaaacgaagcaatgtatcacaagagatatcctgttttaa 8155 

29 TTLFSLQQTKQCITRDILF* 
182ORF051 

9390 atgcttctgaaaaagaaacaaagaacacagacattaataaagatcaaaatcaaaccaaagatacgattacacgatataaaggta 

1 MLLKKKQRTQTLIKIKIKPKIRLHDIKV 

9474 aaaagggaaacactgattatgctgacttactcgaaaaatatcgtagaagtgttttga 9530 

29 KRETLIMLTYSKNIVEVF* 
182ORF052 

4096 gtgatagttgacaagagtcaaatttggcgagattgggcgaatgtacacgtgaaatatcgtgcgctcccgttaagttatggacac 

1 VIVDKSQIWRDWANVHVKYRALPLSYGH 

4180 ataaacgttttgaccgtcaaccaatcgcaaaaaccttttaggagtagcccttaa 4233

29 INVLTVNQSQKPFRSSP* 
182ORF053 

15656 gtggaacagaatacgaagattttagtatcaacaatgagtttgtcaatgatagtgatacgtttattgaaaagagaagtaaaaata 

1 VEQNTKILVSTMSLSMIVIRLLKREVKI

15740 gtagtttcttatgcgccattgcttttgaagggaaaatctttgggtattggatag 15793 

29 VVSYAPLLLKGKSLGIG* 



182ORF054 

8136 gtgatacattgcttcgtttgctgtaaagaaaatagggttgtgataatgatcgatttgaccatcttctgcccctgcgcaaaccat
1 VIHCFVCCKENRVVIMIDLTIFCPCANH 
8052 gaacccaagtgtagggtctttaaagaaaccaagataaccattagtgtgtaa 8002 

29 EPKCRVFKETKITISV* 
182ORF055 

8324 atgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcaggctgtttttgtggataaagatatgg 
1 MKRNTSHCYKLITKLTKIIRLFLWIKIW 
8408 agtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtag 8455 

29 SLTNLLMYGKQMLHM* 
182ORF056 

6549 gtggcccatctcctttttcctattatttacttcctatcaatccaagtggggaggtatacaaaccaaatggggcaggcaatgcta 
1 VAHLLFPIIYFLSIQVGRYTNQMGQAML

6633 attttggagagtacatggcgtttcttacaacgaaagaaccttttttaa 6680 

29 ILESTWRFLQRKNLF* 
182ORF057 

8264 atgtccgccatatctaaagcaaaacgacgtaaacttggtaacgtaggaactttcaagtcattattatacaacatgatacatttt 
1 MSAISKAKRCKLGNVGTFKSLLYNMIHF 
8180 gatttatcatcatcatcatcatatcttaaaacaggatatctcttgtga 8133 

29 DLSSSSSYLKTGYLL* 
182ORF058 

5176 gtgtattcaaattcgcttacttcgtcacctgcgtataaagcgttcactacaccagcaacgaaactattgaaattatcccatgaa 
1 VYSNSLTSSPVYKAFITPATKLLKLSHE

5092 gtaaatgcttcttctaaccatgcttcttggaccgtttgtctgtag 5048 

29 VNAFSNHASWIVCL* 
182ORF059 

15876 atggtctttcgtagtcattgcataaaaatgatttgtatttggttgataatcataactcacatagacacaacctgtttcagcgtc 
1 MVFRSHCIKMICIWLIIITHIDTTCFSV 
15792 tatccaatacccaaagattttcccttcaaaagcaatggcgcataa 15748 

29 YPIPKDFPFKSNGA*

182ORF060

15404 gtgatttttgatttctcaattaaaaactcatcaaacaaaattgtacgaacttcgggatattcattagatttttcaattccccac 

1 VIFDFSIKNSSNKIVRTSGYSLDFSIPK 

15320 gtactaagtggaacagcccaacccattaatttatcatcacaatag 15276

29 VLSGTAQPINLSSQ* 

182ORF061 

2102 atgaggggacttctccacctgtttcagactcgatcacttctgcaatcttactgtaaacctgttcttcctcctgttgtacttccg 
1 MRGLLHLFQTRSLLQSYCXLVLFSVVLL

2018 cttcgtcataaatgtagtcaaggttcatgcctaagaagttactaa 1974 

29 LRHKCSQGSCLRSY* 
182ORF062 

1992 atgtctaagaagttactaacatatgttttcataaatagatcaagccccattgagtcaagtaaacgaacaatatcgtcagcgtct 

1 MSKKLLTYVFINRSSPIESSKRTISSAS 

1908 gaattgaaaatagtaaacatcgcttgtctgaaattgtcgtaa 1867 

29 ELKIVNIACLKLS*
182ORF063 

14306 gtgtaccttctaaacccctctcatgcgcaaaatgatacacaccaatctttttacctaaagacaaagcttgttgaaatgctcggt 

1 VYLLNPSHAQNDTHQSFYLKTKLVEMLG

14222 cacaatcagggtttacataacctgttccgcctgttgctttaa 14181 

29 HNQGLHNLFRLLL* 
182ORF064 

7356 atgatgttagtcaaaccaacaaaagggttgttacttgctaaggctgaaaagatcgctcctcctgcactcattgcactgtttccc 

1 MKLVKPTKGLLLAKAEKIAPPVLIALFP 

7272 ataccatgtctgaaagtattgcgaatgttttgctcttga 7234 

29 IPCLKVLRMFCS* 
182ORF065 

3582 atgaatgctatctgtatcacaataaataatgcgatcaaaacatttttgagcggttgtaatggtagtatatctaccccaagccgt

1 MNAICITINNAIKTFLSGCNGSISTPSR 

3498 cacaaaactagcaagcggaacataaacaggatctcttaa 3460 

29 HKTSKRNINRIS* 
182ORF066 

4234 atgtggctactcttttttgtgtttcacagaattatgtttcacgtgaaacagtttttatggtataatagaatcaaaaggaggtgg 

1 MWLLFFVFHRIMFHVKQFLWYNRIKRRW

4318 agattatggaaattaaagaacatgaatcaattttaa 4353 

29 RLWKLKNMNQF* 
182ORF067 

13882 atgatacctgctttagctttaaaactactaaagtcgacttccttgttaaatttggcaattgtaaaacctaacacaaaatcgata 

1 MIPALALKLLKSISLLNLAIVKPNTKSI

13798 atcattgcaaccattaaccatataatcaaaccataa 13763 

29 IIATINHIIKP* 

182ORF068 

7267 atgtctgaaagtattgcgaatgttttgctcttgagcaatcaaggagtttttgtttccttgcatgaatgcagaagcatagtcaga 

1 MSESIANVLLLSNQGVFVSLHECRSIVR

7183 tttaactcctacatcgttaggatcattatcgattaa 7148 

29 FNSYIVRIIID* 
182ORF069

5027 gtggaacaatgtttttacatcgggaacttcctgtttaaatacccctgtaacagactcgtcagggttgaacttatgttcctgtgc 

1 VEQCFYIGNFLFKYPCNRLVRVELMFLC 

4943 aatgtcaacaaaaatttcttcaatcgttcgacctaa 4906 

29 NVNKNFFNRST* 
182ORF070 

1031 gtgatggttcggctccaccaaaaccagaaacttcgtctgagtaaactagaatatctttcaattctagtacttcgccaattgcgt 

1 VMVRLHQNQKLRLSKLEYLSILVLRQLR 

947 tacgtaacggaattgaagccccgtttgtggcattga 912 

29 YVTELKPRLWH* 
182ORF071 

11741 atggttttgcattatggttgccacaaggcgctcaaagtggtaaaggaattttctttaatgatactcgcaattacaatcgttttg 

1 MVLHYGCHKALKVVKEFSLMILAITIVL 

11825 actttgatttgtttgttcgtaactgtactttaa 11857 

29 TLICLFVTVL* 
182ORF072

11723 atgtttacattaaatgccgtcattgtttcaaactttaatgtcgtttctcccgatcctaagaaagtaactacaggtacatcacgt 

1 MFTLNAVIVSNFNVVSPDPKKVTTGTSR

11639 cccaattcaatggtgttagcaaagcgataa 11610 

29 FNSMVLAKR* 
182ORF073 

2876 gtgaagccgcccttgtatgctttacgcaagtctttatcaaaccctaaagacaaaacaggaaaccattgtttgaaagctgatttt 

1 VKPPLYALRKSLSNPKDKIGNHCLKVDF 

2792 ccatgtgtagcttttagccaatctttgtaa 2763 

29 PCVAFSQSL* 
182ORF074 

8923 gtgattgataaattttgtttcaaattccgctcgttttgtttcgtcataaaacggataatcaaaatcaaacaattgttttcggcc 

1 VIDKFCFKFCSFCFVIKRIIKIKQLFSA

8839 aacttcaatacgttcttttcgagataa 8813 

29 NFNTFFSR*

182ORF075

7463 gtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaaccgtcttctttttcagaaacatagt 

1 VLHYLEYFRYLPLYLPRGSNRFLFQKHS 

7379 tgtttacttgttgtcctgctcccatga 7353

29 CLLVVLLP* 

182ORF076 

2426 atgagtgtggagaatgttcgatcttcttttgcttctttacaccatttgaaaccatttttgaataaccatgaaagcataaactct
1 MSVENVRSSFASLHHLKPFLNNHESINS
2342 ccgtcaaatttttcgttgtggaaataa 2316 

29 PSNFSLWK* 
182ORF077

11858 atgaaggaacgtatgttgttgttgctagaggtagaggggttacatttgaaaattgtctattctctaatatctctcaagcaatta 
1 MKERMLLLLEVEGLHLKIVYSLISLKQL

11942 tcaaaacagcttttcccgatgtaa 11965 
29 SKQLFPM* 
182ORF078 

7671 gtgcctacaatatttggttcttttaatttaatgaaattccatgcttttcttgtttgtaagtttggtgtagctactcgattgctc 
1 VPTIPGSFNLMKFHAFLVCKFGVATRLL 
7587 tttgtgccatacattgagaagtaa 7564 

29 FVPYIEK* 
182ORF079 

7488 gcgaaagataagtttgatccaagctgtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaa 
1 VKDKFDPSCVTLSGIFSISATLPAKRFK

7404 ccgctttctttttcagaaacatag 7381 

29 PFSFSET* 
182ORF080 

4473 gtgtgctatctgctgatgccaaagcctcagttgttgctccgtagtcttctcgcaatgcttcaagatgttctacaatctttgatc 
1 VCYLLMSKLQLLLRSLLAMLQDVLQSLI 
4389 ttgcttcaccgtctgtga 4372 

29 LLHRL*
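
The ORF listings above pair each nucleotide line with its start and end coordinates and with a codon-by-codon translation. The following is a minimal Python sketch of that bookkeeping, not the procedure used to generate the listings. It assumes the standard genetic code, that a leading gtg is printed as its literal amino acid (V) as in the listings, and that reverse-strand ORFs are numbered downward. The names `translate` and `end_coordinate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumption:
# the listings use the standard table and print '*' for a stop codon).
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a nucleotide string codon by codon, ignoring a trailing partial codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def end_coordinate(start, length, reverse):
    """Coordinate of the last nucleotide on a line that begins at `start`."""
    return start - length + 1 if reverse else start + length - 1


# Second line of 182ORF080 (reverse strand, printed 4389 ... 4372)
line = "ttgcttcaccgtctgtga"
print(translate(line))                                  # LLHRL*
print(end_coordinate(4389, len(line), reverse=True))    # 4372
```

Running the same check on each listing line is a quick way to spot transcription errors in either the nucleotide or the amino-acid row.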
Table 24



Sequence similarities phage 182 and public databases 



Phage: 182
Database: nr

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >... 384 e-105
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >... 374 e-103
gi|1429238|gnl|PID|e1173412 (X99260) tail protein [Bacteriophag... 346 3e-94
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2... 208 8e-53
gi|1181970|gnl|PID|e221269 (Z47794) tail protein [Bacteriophage... 62 8e-09
gi|1181968|gnl|PID|e221267 (Z47794) tail protein [Bacteriophage... 56 6e-07
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM... 49 8e-05



Query= sid| 110157 | lan| 182ORP002 Phage 182 ORF| 2152-3873 | 1 
(573 letters) 



gi| 11884 8 | 3p| P19894 jDPOL_BPM2 DNA POLYMERASE >gi | 76896 | pir| | JQ0 .. . 665 0.0 

gij 1429230 |gnl|PID|ell73404 (X99260) DNA polymerase [Bacterioph. . . 657 0.0 

gi|ll8849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP . . . 654 0.0 

gi j 118851 | Sp|P06950 j DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN G P. . . 654 0.0 

gi 1 15732 (X53371) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 

gij 15734 (X53370) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 

gij 1572479 |gnl | PID|e242301 (X96987) DNA polymerase [Bacteriopha. . . 565 e-160 

gijl072656|pir| (S51275 DNA polymerase - phage CP-1 >gi | 836593 | g. . . 301 le-80 

gij 118847) sp|P22374|DPOM ASCIM PROBABLE DNA POLYMERASE >gi|8385... 71 3e-ll 

gij 461962 j Bp j P33537 j DPOM~NEUCR PROBABLE DNA POLYMERASE >gij2833... 65 le-09 

gij461963 j Sp | P33538 |DPOM_NEUIN PROBABLE DNA POLYMERASE >gijl018... 62 le-08 

gi| 1084487 |pir | t S41618 DNA polymerase - slime mold (Physarum po... 61 3e-08 

gi] 2435429 (AF012250) una a signed reading frame (possible DNA po. . . 61 3e-08 

gi|578157tgnl|PIDje246743 (X52106) DNA polymerase [Neurospora i . . . S9 le-07 

gij 2147969jpir| | S72369 probable DNA-polymerase - Gelasinospora ... 58 2e-07 

gij 2147968 jpirj |S62752 probable DNA-polymerase - Gelasinospora ... 58 2e-07 

gi) 3511140 (AF061244) B type DNA polymerase [Agrocybe aegerita] 57 3e-07 

gij 118850 | Sp| P104 79 |DPOL_BPPRD DNA POLYMERASE (PROTEIN PI) >gi [ . . . 56 6e-07 

gijs78144 (X63909) putative DNA-polymerase, B-type [Morchella c... 47 3e-04 

gij 232013 | sp| P30322 |DPOM_AGABT PROBABLE DNA POLYMERASE >gi|3208... 46 6e-04 

Query= sid| 110159 | lan| 182ORF004 Phage 182 ORF| 4626-5954 | 3 
(442 letters) 

gi|l38117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ... 309 2e-83 

gij 138118 jspjp0753ljvG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ... 305 3e-82 

gij 1429236 |gnl | PlD|ell73410 (X99260) major head protein [Bacter... 300 le-80 

gijll8195B|gnl jpiDje221257 (Z47794) major head protein (Bacteri... 152 6e-36 

Query= sidj 110160 | lan| 182ORF005 Phage 182 ORFj 12651-13700 | 3 
(349 letters) 

gi| 137932 |sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR... 52 8e-06 

gij 1429242 |gnl| PID|ell73416 (X99260) morphogenesis protein [Bac... 48 7e-05 

gi|l37933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR... 47 2e-04 



Query= 



sidj 110161 | lan|l82ORFO06 Phage 182 ORF| 14995-16026 | 1 
(343 letters) 



gi| 137944 | sp|P11014 |VG16_BPPH2 ENCA PS I DAT I ON PROTEIN (LATE PROT. . . 402 e-111 

gijl37945jspj P07541 | VG16_BPPZA ENCA PS I DAT I ON PROTEIN (LATE PROT... 402 e-111 

gij 1429245 ]gnl| PID|ell73419 (X99260) encapsidat ion protein [Bac... 381 e-lOS 

gij 1181972 jgnl | PIDje221271 (Z47794) encapsidat ion protein [Bact . . . 159 2e-38 



Query* sid| 110162 | lan| 182ORF007 Phage 182 ORF| 7795-8775 | 1 
(326 letters) 
gi|1429239|gnl|PID|e1173413 (X99260) upper collar protein [Bact... 271 5e-72

gi j 137915 | sp | P0753 5 | VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ... 256 le-67 

gi|l37914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN { CONNECTOR ... 256 2e-67 

gij 1181960 |gnl) PIDj e221259 {Z47794) connector protein {Bacterio... 148 6e-35 

Query= sidj 110163 | lan | 182ORF008 Phage 182 ORF| 14105-14983 | 2 
(292 letters) 

gi|4210750|gnljPID|el374037 (AJ132604) LysL protein (Lactococcu. . . 139 2e-32 

gi j 462559 | sp| P34020 | LYC_CLOAB AUTOLYTIC LYSOZYME (1, 4-BETA-N-AC. . . 75 8e-13 

gi|2327014 (U82823) putative lysozyrae (Saccharopolyspora erythr . . . 64 2e-09 

gij 126652 jsp|P25310 | LYCM_STRGL LYSOZYME Ml PRECURSOR (1, 4 -BETA- . . . 60 2e-08 

gi j 127789 | apj P19386 j LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE. . . 60 2e-08 

gi j 67761 | pir j jMUBPCP N-acetylmuraraoyl-L-alanine amidase {EC 3.5... 59 3e-08 

gij 4105636 (AF04 9087J lys (Leuconostoc oenos bacteriophage 10MCJ 59 3e-08 

gij 623084 (L024 96) rauramida3e; rauramidase [Bacteriophage LL-HJ 57 le-07 

gi j 127787 | sp| P15 057 |LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE. . . 57 2e-07 

gi j 126597 | ap j POO 721 j LYCH_CHASP N, O-DIACETYLMURAMIDASE (LYSOZYME. . . 57 2e-07 

gi j 127788 jspjpi9385 j LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE... S7 2e-07 

gij67762|pirj |MUBPC7 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 56 3e-07 

gij3025168|3p|P7642l|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN... 53 2e-06 

gi|4204413 (AF047001) Lys44 [Oenococcus oeni temperate bacterio... 53 3e-06 

gi|2116978|gnl|PID|dl020940 (D88151) cortical f ragment-lytic en... 52 5e-06 

gij 2392844 (AF011378) lysin [Bacteriophage ski] 43 8e-05 

Query= sidj 110164 | lan j 182ORF009 Phage 182 0RF| 8765-9601 | 2 
(278 letters) 

gi| 1429240 |gnl| PID|ell734l4 (X99260) lower collar protein [Bact... 180 le-44 

gi j 137921 J sp| P04333 j VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE. . . 171 5e-42 

gij 215341 (M12456) pll lower collar protein [Bacteriophage phi-29j 98 9e-20 

gij 224162 |prf I j 1011232B protein pll , lower collar [Bacteriophage... 97 le-19 

gi 1 535260 (Z30339) STARP antigen [Plasmodium reichenowi) 50 le-05 

gi)4049753 (AF063866) ORF MSV230 hypothetical protein [Melanopl . . . 49 4e-05 

gij21315S7|pir| (S70306 hypothetical protein YEL0 77c - yeaat (Sa... 48 5e-05 

gijl317B2|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD... 48 7e-05 

gij2131309|pir| |S70305 hypothetical protein YBL11 3c - yeast (Sa. . . 47 2e-04 

gij 499325 (Z26314) STARP antigen [Plasmodium falciparum] 4 6 3e-04 

gij 3845171 (AE001391) ribosome releasing factor (OO, TP) [Plasm... 46 3e-04 

gij731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197. S KD PROTEIN IN... 45 5e-04 

gij 1632829 |gnl|PID|e276379 (Y08924) AARP2 protein [Plasmodium f .. . 45 5e-04 

gij 1176490 j Bp |P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I... 4 5 5e-04 

gijl077300|pir| (S51848 hypothetical protein HRD1054 - yeast (Sa. . . 45 Se-04 

gij 2425143 (AF020407) WimA (Dictyostelium discoideum) 4 5 6e-04 

gijll8196ljgnl|PID|e221260 (Z47794) collar protein [Bacteriopha. . . 45 6e-04 

gij2132657jpirj |S64819 probable membrane protein YLL067c - yeas... 45 8e-04 

gi|213304l|pir j js65341 probable membrane protein YPR204W - yeas... 45 8e-04 

gij 730275 |sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1... 45 8e-04 

Query= sid| 110165 | lan| 182ORF010 Phage 182 ORF 1 1310-2155 1 2 
(281 letters) 

gi|l3S604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN >gi | 75815 | pi .. . 69 3e-ll 

gijl572478|gnl|PID|e242334 (X96987) terminal protein [Bacteriop. . . 65 3e-10 

gij 1429231 j gnl j PID j ell73405 (X99260) terminal protein [Bacterio... 64 le-09 

Querya sid| 110166 j lan| 182ORF011 Phage 182 ORF| 9607-10158) 1 
(183 letters) 

gi|l37928|sp| P07S37|VG12_BPPZA PRE -NECK APPENDAGE PROTEIN (LATE... 51 6e-06 

gijl429241|gnl|PID|ell73415 (X99260) pre-neck appendage protein. . . 51 6e-06 

gijl37927|sp| P20345|VG12_BPPH2 PRE -NECK APPENDAGE PROTEIN (LATE... SO le-05 



Query** sid| 110169 1 lan| 182ORF014 Phage 182 ORF[ 13716-14108 | 3 
(130 letters) 

gi|l37936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14 . . . 97 6e-20 

gij 137938 j spj P07S39 j VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14 . . . 96 Be-20 

gijl429243|gnl|PID|ell73417 (X99260) lysis protein [Bacteriopha... 96 8e-20 

gij 215332 (M14782) lysis protein [Bacteriophage phi-29) 94 5e-19 

Query= sidj 110170 | lan| 182ORF015 Phage 182 ORF| 854-1225 j 2 
(123 letters) 
gi|15670 (V01155) reading frame 10 (may be gene 4) [Bacteriopha... 70 5e-12
gi|l38072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A >gi | 75836 | pir . 69 7e-12 

Query= sid| 110174 | lan| 182ORF019 Phage 182 ORF| 4323-4613 | 3 
(96 letters) 

gi|l429235|gnl|PID|ell73409 (X99260) head morphogenesis protein. . . 61 2e-09 
gi|l38111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ... 57 3e-08 
gi | 138112 | sp] P07533 | VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ... 54 le-07 

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 >gi|75841|pir||... 55 7e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 >gi|75840|pir||... 54 2e-07
gi|1429234|gnl|PID|e1173408 (X99260) gene 6 product [Bacterioph... 54 2e-07
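
Each query header in Table 24 gives the genome coordinates of the ORF, and the line below it gives the protein length in letters. The two are linked by simple arithmetic, sketched here in Python. The sketch assumes that the coordinate span includes the stop codon and that the header has been cleaned to the form `sid|...|lan|... ORF|first-last|frame`. The helper name `expected_letters` is hypothetical.

```python
import re

# Coordinate field of a cleaned query header, e.g. "ORF|5966-7780|2"
HEADER = re.compile(r"ORF\|(\d+)-(\d+)\|(\d)")


def expected_letters(header):
    """Return the protein length implied by an ORF's genome coordinates."""
    match = HEADER.search(header)
    if match is None:
        raise ValueError("no coordinate field found in header")
    first, last = int(match.group(1)), int(match.group(2))
    codons, remainder = divmod(abs(last - first) + 1, 3)
    if remainder:
        raise ValueError("coordinate span is not a whole number of codons")
    return codons - 1  # assumption: the final codon is the stop codon


print(expected_letters("sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2"))  # 604
print(expected_letters("sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2"))    # 88
```

For 182ORF001, the span 5966-7780 covers 1815 nucleotides, or 605 codons, which gives the 604 letters reported once the stop codon is set aside.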
Table 25

Homologies between 182 ORFs and proteins in public databases 



Phage: 182 
Database: Swissprot 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 384 e-106

gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) 374 e-103

gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM... 49 2e-05

Query- sid| 110157 | lan| 182ORF002 Phage 182 ORF| 2152-3873 1 1 
(573 letters) 

gi|H8848|sp|P19894|DPOL_BPM2 DNA POLYMERASE 665 0.0 

gi|ll8849|spjP03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0 

gi|ll885l|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0 

gi|ll8847|spjP22374|DPOM_ASCIM PROBABLE DNA POLYMERASE 71 7e-12 

gi| 461962 jsp|P33537|DPOM~NEUCR PROBABLE DNA POLYMERASE 65 3e-10 

gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE 62 3e-09 

gij 118850| sp|P10479 | DPOL~BPPRD DNA POLYMERASE (PROTEIN PI) 56 2e-07 

gi| 232013 |sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE 46 2e-04 

gij 118887 |spjP10582|DPOM_MAIZE DNA POLYMERASE (S-l DNA ORF 3) 46 2e-04 

Query= sid| 110159 | lan | 182ORF004 Phage 182 ORF | 4626-5954 | 3 
(442 letters) 

gi|l38117jsp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ... 309 6e-84 

gi j 138118 j sp | P07531 j VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ... 305 7e-83 

Query;* aid | 110160 | lan| 182ORF005 Phage 182 ORFj 12651-13700 | 3 
(349 letters) 

gi|l37932|ap|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR... 52 2e-06 

gi j 137 933 jap | P0 753 8 1 VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR... 47 6e-05 

Query* sid| 110161 | lan| 182ORF006 Phage 182 ORF| 14995-16026(1 
(343 letters) 

gi | 137945 [ sp | P07541 | VG16_BPPZA ENCAPSIDATION PROTEIN ( LATE PROT . . . 402 e-112 

gij 137944 jsp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT... 402 e-112 

Query* sid| 110162 | lan) 182ORF007 Phage. 182 ORF | 7795-8775 | 1 
(326 letters) 

gi | 137915 | Sp | P07535 | VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ... 256 3e-68 

gij 137914 |sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ... 256 5e-68 

Query= sid| 110163 | lan| 182ORF008 Phage 182 ORF| 14105-14983 | 2 
(292 letters) 

gi|462559|sp| P34020 | LYC_CLOAB AUTOLYTIC LYSOZYME (1 , 4 -BETA-N-AC . . . 75 2e-13 

gi|l26652 japj P25310 j LYCM_STRGL LYSOZYME Ml PRECURSOR (1,4 -BETA- . . . 60 5e-09 

gi|l27789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE . . . 60 Se-09 

gi j 127787 jap | PIS 0S7 | LYCA^BPCPl LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 4e-0S 

gi 1 126597 j sp j P00721 j LYCH_CHASP N, O-DIACETYLMURAMIDASE (LYSOZYME... 57 4e-08 

gi j 127788 j sp ( P19385 j LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 5e-08 

gi | 3025168 | 9p| P7642l|YEGX__ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN... 53 5e-07 

Query= 3id| 110164 | lan) 182ORF009 Phage 182 ORF| 8765-9601 | 2 
(278 letters) 

gi| 137921 |sp|P04333|VGll_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE. . . 171 le-42 

gij 131782 |sp|P12753|RA50~YEAST DNA REPAIR PROTEIN RAD50 (153 KD. . . 48 2e-05 

gijll764 90|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I... 45 le-04 

gij 731903 |sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN... 45 le-04 

gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1... 45 2e-04 

gijll68610|3p|P41696|AZFl_YEAST ASPARAGINE-RICH ZINC FINGER PRO... 44 3e-04 
gi|731587|sp|P38900|YH19_YEAST HYPOTHETICAL 70.1 KD PROTEIN IN... 44 3e-04

Query* sid| 110165 | lan| 182ORF010 Phage 182 ORFj 1310-2155 | 2 
(281 letters) 

gi 1 135604 | sp| P06812 |TERM_BPNF DNA TERMINAL PROTEIN 69 8e-12 

Query*, sid| 110166 | lan | 182ORF011 Phage 182 ORF| 9607-10158 1 1 
(183 letters) 

gi| 137928 |sp|P07537|VG12_BPPZA PRE -NECK APPENDAGE PROTEIN (LATE... 51 2e-06 

gi 1 137927 | spj P20345 | VG12~BPPH2 PRE -NECK APPENDAGE PROTEIN (LATE... 50 3e-06 

Querya sid| 110169 | lan | 1B2ORF014 Phage 182 ORF| 13716-14108 | 3 
(130 letters) 

gi|l37936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14) 97 2e-20 

gij 137938 |sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14) 96 2e-20 

Query= sid| 110170 | lan 1 182ORF015 Phage 182 ORFj 854-1225 { 2 
(123 letters) 

gij 138072 |ap|P06 953 |VG5A_BPPZA EARLY PROTEIN GP5A 69 2e-12 

Query= sid| 110174 | lan | 182ORF019 Phage 182 ORF| 4323-4613 | 3 
(96 letters) 

gi|l3811l|sp|P1384B|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ... 57 9e-09 

gi 1 138112 | Bp | P07533 j VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ... 54 4e-08 

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 55 2e-08

gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 54 5e-08
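
The hit lines in Tables 24 and 25 end with a bit score and an expectation value, where a bare exponent such as `e-106` stands for 1e-106. The Python sketch below splits one cleaned hit line into its three parts. It assumes the OCR noise has already been removed and that the score is a whole number. The pattern and the name `parse_hit` are illustrative and are not part of any BLAST tool.

```python
import re

# description, integer bit score, then an E-value such as 2e-08, e-106 or 0.0
HIT = re.compile(
    r"^(?P<desc>.*?)\s+(?P<score>\d+)\s+"
    r"(?P<evalue>(?:\d+(?:\.\d+)?)?e-\d+|0\.0)$"
)


def parse_hit(line):
    """Split a one-line BLAST hit summary into (description, score, E-value)."""
    match = HIT.match(line.strip())
    if match is None:
        raise ValueError("line does not look like a hit summary")
    raw = match.group("evalue")
    # A bare exponent ("e-106") is shorthand for 1e-106.
    evalue = float("1" + raw if raw.startswith("e") else raw)
    return match.group("desc"), int(match.group("score")), evalue


desc, score, evalue = parse_hit(
    "gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 55 2e-08"
)
print(score, evalue)  # 55 2e-08
```

Parsed this way, the hits for one query can be sorted or filtered by E-value without retyping the table.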
BLASTP 2.0.8 [Jan-05-1999]



Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 384 bits (975), Expect = e-105

Identities = 231/610 (37%), Positives = 344/610 (55%), Gaps = 36/610 (5%)

Query: 6 TNVKLLANVPFDNTYTHTRWFKTQ£EQ 65 

TNV++LA+VPF N Y +TRWF + Q ++FNS + E ++Q + V 
Sbjct: 9 TNVR I LADVPFSNDYKNTRWFTS S SNQYNWFNS KTRVYEMS KVTFQGFRENKS YI S VS LR 68 

Query: 66 KD ALYACNYL I FKNEETY P S KWQYAFVTD I E YKNDNTS FVTFE I D VLQTYRFD IG I RE S F 125 

D LY +Y++F+N + Y +KW YAFVT++EYKN T++V FEIDVLQT+ F+I +ESF 
Sbjct: 69 LDLLYNAS YIMFQNAD - YGNKWFYAFVTELE YKNVGTTYVHFE IDVLQTWMFNI KFQES F 127 

Query: 126 IAKEHPQLYYSNGIPFINTIEESLDYGREYTTTNVTTFHPNDGVNFLVILTSEAM- - PVG 183 

I +EH +L+ +G P INTI+E L+YG EY +V P D + FLV+++ M G 
Sbjct: 128 IVREHVKLWNDDGTPTINTIDEGIiNYGSEYDIVSVENHRPYDDMMFLVVISKSIMHGTAG 187 

Query: 184 DKEDKSG GSIVGGPSPFSYYLLPINSSGEVYKPN-GAGNANFGEYHAFLT TKEP 236 

+ E +• S+ G P P YY+ P G+V K G NAN + LT +++ 

Sbjct: 188 EAESRLNDINASIiNGMPQPLCYYIHPFYKDGKVPKTFIGDNNANLS 247 

Query: 237 FI^IVGMYVTSYTGIPFIVDHANKTVRYNAGGSYKIMLPTYASDPTGTMKTFAFFCVKE 296 

+N IV MYVT YG+ ++K+++ + + ADG+T VK+ 

Sbjct: 248 AVNN I VNMYVTDY I G LKLD YKNGD KELKLDKDMFEQAG I - - - ADDKHGNVDT IF- - - VKK 301 

Query: 297 ARTFVPKRIDLVGNVY^FREAFPFNVKESKLFMYPYCLIEITDTKGHVMTLRPEYLTGG 356 

+ ID G+ + F + +ESKL MYPYC+ E+TD KG+ M L+ EY+ 

Sbjct: 302 I PD YETLE ID- TGDKWGG FTKD QESKLMMYPYCVTEVTDFKGNHhINLKTEYIDNN 355 

Query: 357 KLS VYVKGSLGI SNKVM I E P ID YD VSNS TI ITNLSDKMLIDNDPNDVGVKSDYASA 412 

KL + V+GSLG+SNKV DY+ S +T D LI+N+PND+- + +DY SA 

Sbjct: 356 KLKIQVTIGSLGVSNKVAYSIQDYNAGGSLSGGDRLTASLDTSLINNNPNDIAIINDYLSA 415 

Query: 413 FMQGNKNSLIAQEQNIRNTFRHGMGNSAMSTGGAIFSAIASNNPFVGLTNIMGAGQQVNN 472 

++QGNKNSL Q+ +1 GM +S G ++ +PF +++ G N 
Sbjct: 416 YLQGNKNSLENQKSSILFNGIVGMLGGGVSAG ASAVGRS PFGLAS S VTGMTSTAGN 4 71 

Query: 473 YVSEKENGIiNI^GKVADIENIPDNVTQLGSNl^FT^N-FQNYYQLRFKQIKYEYATRL 531 

V + + L K ADI NIP +T++G N +F GN ++ Y ++ KQ+K EY L 
Sbjct: 472 AVLD MQALQAKQADIANI PPQLTKMGGNTAFDYGNGYRGVYVI K- KQLKAEYRRSL 526 

Query: 532 DRYFSMYGTKSNRVATPNLQTRKAWNFIKLKEPNIVGTMSNDVLTRVKQIFSAGVTLWHT 591 

+F YG K NRV PNL+TRKA+N+ 1 + K+ I G ++N+ L ++ IF G+TLWHT 
Sbjct: 527 SSFFHKYGYKINRVKKPNLRTRJCAYNYIQTIQDCFISGDINNNDLQEIRTIFDNGITLWHT 586 

Query: 592 NDVLNYNQDN 601 

+D+ NY+ +N 
Sbjct: 587 DDIGNYSVEN 596 



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 665 bits (1697), Expect = 0.0

Identities = 327/589 (55%), Positives = 420/589 (70%), Gaps = 38/589 (6%)

Query: 3 KKYTGDFETITDLNDCRVWSWGVOTIDNVTJNM^ 62 
K ++ DFETTT L+DCRVW++G +1 N+DN G +D F +W M+ D+YFHN KF 
Sbjct: 4 KMFSCDFETTTKLDDCRVWAYGYMEIGNLDNYKIGNSLDEFMQWV-MEIQADLYFHNLKF 62

Query: 63 DGEFMLSWLFKNGFKWCKEAKEDRTFSTLISNMGQWYALEICWEVNYXXXXXXXXXXXXX 122 

DG F+++WL ++GFKW E + T++T+IS MGQWY ++IC+ 
Sbjct: 63 DGAFIVTWLEQHGFKWSNEGLPN-TYNTIISKMGQWYMIDICFGYX GKRKL 112 

Query: 123 XXIIVT5SLKKYPFPVKQIAEAFNFPIKKGEIDYTKERPIGYKPTKDEWEYLKNDIQIMAM 182 

+IYDSLKK PFPVK+IA+ F P+ KG+IDY ERP+G++ T +E+EY+KNDI+I+A 
Sbjct: 113 HTVIYDSLKKLPFPVKKIAKDFQLPLLKGDIDYHTERPVGHEITPEEYEYIKNDIEIIAR 172 

Query: 183 ALKIQFDQGLTRMTRGSDALGDYKDWLKATHGKSTFKQWFPILSLGFDKDLRKAYKGGFT 24 2 

AL IQF QGL RMT GSD+L +KD L F + FP LSL DK++RKAY+GGFT 

Sbjct: 173 ALDIQFKQGLDRMTAGSDSLKGFKDILST KKFNKVFPKLSLPMDKEIRKAYRGGFT 228 

Query: 243 VWKVFQGKBIGDGIVFDVNSLYPSQMYVRPLPYGTPLFYEGEYKPNNDYPLYIQNIKVR 302 

W+N ++ KEIG+G+VFDVNSLYPSQMY RPLPYG P+ ++G+Y+ + YPLYIQ 1+ 
Sbjct: 229 WLNOKYKEKEIGEGMVFDVNSLYPSQMYSRPLPYGAPIVFQGKYEKDEQYPLYIQRIRFE 288 

Query: 303 FRLKEGYIPTIQVKQSSLFIQNEYLESSVNKLGVDELIDLTLTNVDLELFFEHYDILEIH 362 

F LKEGYI PTIQ+K++ F NEYL++S GV E ++L LTNVDLEL EHY++ + 
Sbjct: 289 FELKEGYIPTIQIKKNPFFKGNEYLKNS GV-EPVELYLTNVDLELIQEHYELYNVE 343 

Query: 363 YTYGYMFKASCDMFKGWIDKWIEVKNTTE^ARKANAKGMLNSLYGKFGTNPDITGKVPYM 422 

Y G+ F+ +FK +IDKW VK EGA+K AK MLNSLYGKF +NPD+TGKVPY+ 
Sbjct: 344 YIDGFKFTlEKTGLFKDFIDKWTYVKTOEEGAKKQIJUa^ 403 

Query: 423 GEDGIVRLTLGEEELRDPVWPIJ^FVTAWGRYTTITTAQKCFDRIIYCDTDSIHLVGTE 482 

+DG + +G+EE +DPVY P+ F+TAW R+TTIT AQ C+DRIIYCDTDSIHL GTE 
Sbjct: 4 04 KDDGS LGFRVGDE EYKD PVYT PMGVFI TAW AR FTT I T AAQACYDR I I Y CDTD S I HLTGTE 463 

Query: 483 VPEAIDHLVDPKKLGYWGHESTFQRAKFIRQKT YVEEIDGEL 524 

VPE I +VDPKKLGYW HESTF+RAK++RQKT YV+E+DG+L 
Sbjct: 464 VPEIIKDIVDPKKbGYWAHESTFKRAXYlJlQKTYIQDIYVKEVDGKLKECSPDEATTTKF 523 

Query: 525 NVKCAGMPDRIK£IVTFDNFEVGFSSYGKLLPKRTQGGVVLVDTMFTIK 573 

+VKCAGM D IK+ VTFDNF VGFSS GK P + GGWLVD++FTIK 
Sbjct: 524 SVKCAGMTDTIKKKVTFDNFAVGFSSMGKPKPVQVNGGVVLVDSVFTIK 572 



Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

>gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN GP8)
>gi|75845|pir||WMBPB9 gene 8 protein - phage phi-29
>gi|215325 (M14782) major head protein [Bacteriophage
phi-29] >gi|225362|prf||1301270B gene 8 [Bacillus sp.]
Length = 448

Score = 309 bits (783), Expect = 2e-83

Identities = 176/440 (40%), Positives = 250/440 (56%), Gaps = 27/440 (6%)

Query: 4 KITEQDVLRATNVETPVQLMTAI YNSSSS LFQANVPMPNADNIEAVGAGI TRLDWKNEF 63

+IT DV + + ++ AI NS F++ VP+ A+N+ VGAGI V+N+F
Sbjct: 2 RITFNDVKTSLGITESYDIVNAIRNSQGDNFKSYVPLATANNVAEVGAGILINQTVQNDF 61

Query: 64 ISTLVDRIGKWIRYKSWRNPLKMFKKGNMPLGRTI EEI FVDI AQEHKFNPDESVTGVFK 123

I++LVDRIG WIR S NPLK FKKG +PLGRTIEEI+ DI +E +++ +E+ VF+
Sbjct: 62 ITSLVDRIGLWI RQVS LNNPLKKFKKGQI PLGRTIEE I YTDITKEKQYDAEEAEQKVFE 121

Query: 124 QEVPDVKTLFHEINREGYYKQTIQEAWLEKAFTSWDNFNSFVAGVMNALYTGDEVSEFEY 183

+E+P+VKTLFHB NR+G+Y QTXQ+ L+ AF SW NF SFV+ ++NA+Y EV E+EY
Sbjct: 122 REMPNVTCTLFHERNRQGFYHQTIQDDSLKTAFVSWGNFESFVSSIINAIYNSAEVDEYEY 181

Query: 184 TKLLI ANYQEKELFKEI E IGE ITESNA- - KEFIRKI KSTSNKLEFM - - SSAYNAQG VKTS 239

KLL+ NY K LF ++I E T S EF++K+++T+ KL S +N+ V+T
Sbjct: 182 MKLLVDNYYSKGLFTTVTCIDEPTSSTGALTEFVKKMRAT 241

Query: 240 TSKSDQYXXXXXXXXXXXXXXXXXXXFNMSKTDFVGHKIVI DEF PKKEGEESSNI VAVI V 299

+ D + FNM++TDF+G+ VID F S+ + AV+V
Sbjct: 242 S YMEDLHLI IDADLEAEIJJVDVLAKAFNMNRTDFLGNVTVIDGF ASTGLEAVLV 295

Query: 300 DSEWFMIYDKLYKTTSLYNPEGLYWNYWLHHHQLYSTSQFGNAVAFVKSATKPVTKVAFA 359

D +WFM+YD L+K ++ NP GLYWNY+ H Q S S+F NAVAFV VT+V +
Sbjct: 296 DKDWFMVYDNIiHKMETVRNPRGLYWNYYYHVWQTLS VS RFANAVAFVSGDVPAVTQVT VS 355

Query: 360 SATTS WKG S S KD IALTFT P VEATNQQGEWS S AP AL VKAT VKQTAGKATA VTVEGLEVG 419

+V +G + V ATN + V V G +T + G
Sbjct: 356 PN I AA VKQGGQQQ FT AYVRATNAKDHKV VW S VEGGSTGT A I TG 398

Query: 420 QS L VTFTAI GGQQATVLVTV 439

L++ + Q TV TV
Sbjct: 399 DGLLSVSGNEDNQLTVKATV 418



Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

>gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE
PROTEIN GP13) >gi|75858|pir||WMBP23 gene 13 protein -
phage phi-29 >gi|215331 (M14782) morphogenesis protein
[Bacteriophage phi-29] >gi|225368|prf||1301270H gene 13
[Bacteriophage phi-29]
Length = 365

Score = 51.5 bits (121), Expect = 8e-06

Identities = 44/166 (26%), Positives = 70/166 (41%), Gaps = 14/166 (8%)

Query: 6 NEQIARGQTIAKILSKYGYNKNSQVGWANLHWESA GLNPNSNEXXXXXXXXX-QWT 61 

+E Q I LS G+ K + G++ N+ ES GL N +E QWT 

Sbjct: 12 S EMKVNAQ YI LNYL S SNG WTKQA I CGMLGNMQS E ST I NPG LWQNLD EGNTS LG FG L VQ WT 71 

Query: 62 PKSNLYRQAQICGLSNAKAETLEGQAEIIAO^DKTGQWMDNTPVSSAGYTNPQ/TLSAFKQ 121 

P SN A GL ++ II + + QW++ ++ Y K 
Sbjct: 72 PASNYINWANSQGLPYKDMDS - - ELKRI I WEVNNNAQWINLRDMTFKEY IKS 121 

Query: 122 SANIDVATINFMCHWERPGKLHIEERLDLAQAYSKHIDGSGGGGVK 167 

+ + F+ +ERP + ER D A+ + K++ G GGGG++ 

Sbjct: 122 TKTPRELAMIFLASYERPANPNQPERGDQAEYWYKNLSGGGGGGLQ 167 



Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

>gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROTEIN
GP16) >gi|75861|pir||WMBP16 gene 16 protein - phage PZA
>gi|216065 (M11813) morphogenesis protein C
[Bacteriophage PZA]
Length = 332

Score = 402 bits (1023), Expect = e-111

Identities = 186/332 (56%), Positives = 244/332 (73%), Gaps = 2/332 (0%)

Query: 11 EKNLYYNPNNAIX3FNCLMLFVIGARGIGKTYGYKKFVWRFIKHGEQFIYLRRFKTELKK 70

+K+L+YNP L ++ ++ FVIGARGIGK+Y K + +NRFIK+GEQFIY+RR+K EL K
Sbjct: 2 DKSLFYNPQKMLSYDRILNFVIGARGIGKSYAMKVYPINRFIKYGEQFIYA/RRYKPELAK 61

Query: 71 I PQFFKTMAKEFPDHKLEVKGKEFYCDDKLMGWAVPLSTWG I EKSNEYPEVRT I LFDE FL 130

+ +F +A+EFPDH+L VKG+ FY D. KL GWA+PLS W EKSN YP V TI+FDEF+
Sbjct: 62 VSNYFNDVAQEFPDHELVVKGRRFYIIXSKLAGWAIPLSVWQSEKSNAYPNVSTIVFDEFI 121

Query: 131 IEKSKITYLPNEAEALLNMMETVFRRRTNTRCVMLSNATSVW 190

EK Y+PNE ALLN+M+TVFR R RC+ LSNA SWNPYFL+FNL PD+NKRFN
Sbjct: 122 REKDNSNYIPNEVSAIJjNLMI>TVFRNRERVRCICI*SW 181

Query: 191 LYQDRGILIELCDSKDFAEVKRETP FGRLIRGTEYEDFS INNEFVNDSDTFI EKRSKNSS 250

+Y D LIE+ DS DF+ +R+T FGRLI GTEY + S++N+F+ DS FIEKRSK+S
Sbjct: 182 VYDD- - ALIEI PDSLDFSSERRKTRFGRLIDGTEYGEMSLDNQFIGDSHVFIEKRSKDSK 239

Query: 251 FIXIAIAFEGKIFGYWIDAETGCVYVSYDYQPNTNHFYA>nTKDHE 310

F+ +1 + G G W+D G +YV + P+T + Y +TT D EN +L+ N++NNY+L
Sbjct: 240 FVFSIVYNGFTLGVWVDVNO^LMYVDTAHDP 299

Query: 311 STVAKAPKNSYLRFDNIVIKNLHYDLFNKMKI 342

+A AF N YLRFDN VI+N+ Y+LF KM+I
Sbjct: 300 RKLASAFMNGYLRFDNQVIRNIAYELFRKMRI 331





Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308

Score = 271 bits (685), Expect = 6e-72

Identities = 131/275 (47%), Positives = 187/275 (67%), Gaps = 5/275 (1%)

Query: 36 YYEHYRRQLTLLT FQLFEWENL PKS I D P RYLE I ALHTNG YLG F FKD PTLGFMVCAGAE DG 95

+Y HY + L L +QLFEWE LP S+DP YLE ++H GY+GF+KDP +G++ C GA G
Sbjct: 22 WYYHYYQYLCSLAYQLFEWERLPPSVDP SYLEKS IHQFGYVGFYKDPRIGYI ACQGALSG 81

Query: 96 QIDHYHNPI FFTANEAMYHKRY PVLRYDDDDDKS KCIMLYNNDLKVPTLPS LHRFALDMA 155

+DHY+ PFA+ Y ++YD +K+ + +YNNDLK TLP+L FA D+A
Sbjct: 82 TVDHYNLPDRFHASSVGYQNTFKLYNYSDMKEKNMGVAIYNNDLKCSTLPALEMFAQDLA 141

Query: 156 DINQISRVNRRAQKTPVI IQTDEKKYFS LLQAYNQIDENNQAVFVDKDMEFDES FNVWQT 215

++ +1 VN+ AQKTPV+I ++ SL YNQ + N +FV + ++ D + V++T
Sbjct: 142 ELKEIIAVNQNAQKTPVLIAANDNNQLSLKNIYNQYEGNAPVIFVHESLDLD-NLKVFKT 200

Query: 216 NAPYWDKLRS ELNEVWNEVLTFLGI NNANVDKTARVQTSEVLSNNEQIESSGNI LLKSR 275

+APYWDKL ++ N VWNEV+T+LGI NAN++K R+ TSEV SN+EQIESSGNI LK+R
Sbjct: 201 DAPYVVDKLNAQKNAVWNEVMTYliGIKNANLEKKERMVTSEVDSNDEQIESSGNIYLKAR 260

Query: 276 KEFCDRVNRVFGDELDGKIDVKFRTDAVRQLQLAA 310

+E C++++ ++G L VKFR D V Q++L A
Sbjct: 261 QEACNKISELYGLNL KVKFRYDIVEQMRLNA 291




Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

>gi|4210750|emb|CAA10710| (AJ132604) LysL protein [Lactococcus
lactis]
Length = 235

Score = 139 bits (347), Expect = 2e-32

Identities = 85/210 (40%), Positives = 114/210 (53%), Gaps = 14/210 (6%)

Query: 2 MNGIDISSYQTGIDLSKVPCDFWIKATGGTGYWPDCDRAFQQALSLGKKIGVYHFAHE 61 

MNGIDISSYQ ++ VP DFV IKAT GT Y+NP + Q + K +G YHFA 
Sbjct: 1 MNGIDISSYQAELNAGIVPSDFVIIKATEGThTflNPTWEEQAGQVIQTNKLLGFYHFA 59 

Query: 62 RGLEGTPQQEAQFFLDNIKGYIGKAVLILDFEGS - -NQKDVNWAKAFLDYVYNKTGVKAW 119 

G P EA FF+ +K YIGKAVL+LDFE N A+ FL+ V KTG+ 

Sbjct: 60 VGNPI AEADFF I S VV1CNYI GKAVLVLDFEAG AINAWGNVGARQFLNRVKEKTG I NPM 116 

Query: 120 FYTYTANLNTTDFSS IAKGDYGLWVAEYGSNQPQGYSQPAPPKTNN FPIVACFQF 174 

Y + ++S+I+ + LWVA+Y S P GY + P T+ + A Q+ 

Sbjct: 117 IYMSSDVTRQFNWSTISSTN-PLWVAQYASMNPTGYQ--SEPWTDGKGYGAWSSAAIHQY 173 

Query: 175 TSKGRLPGYNGNLDLNVFYGDGNTWDLYVG 204 

+S G L ++GNLD+N+ Y + N W G 
Sbjct: 174 SSAGSLSNWSGNLDINLAYINANQWKSLAG 203 



Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters) 

>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293

Score = 180 bits (451), Expect = 1e-44

Identities = 115/296 (38%), Positives = 161/296 (53%), Gaps = 33/296 (11%)

Query: 3 LKRYIESFTYYQPELSRKERIEVGRKQLFDFDYPFYDETKRAEFETKFINHFYLREIGSE 62 

L YIE ++ Y+ LS E+IE GR +LFDF YP +DE+ R FET FI +FY+REIG E 
SbjCt: 8 LSTYIEMWSQYETGLSMAEKIEKGRPKLFDFQYPIFDESYRKVFETHFIRNFYMREIGFE 67 

Query: 63 TMGSFKFNLDEYLNLNMPYWNKMFLSNLEEF-PIFDDMDYTIDEKQKLLNEIDTNIKANR 121

T G FKFNL+ +L +NMPY+NK+F S L ++ P+ + T K+ DT NR

Sbjct: 68 TEGLFKFNLETWLIINMPYFNKLFESELIKYDPLENTRLNTTGNKKN-----DTERNDNR 122

Query: 122 D ES KNQTKQVDQTDNRNKKTRDTGTT DS FSRNTYTDTPQKDLRI ASNG 169

D + K+ TK D+T+ + D TT D+F+R +D P L + +N

Sbjct: 123 DTTGSMKADGKSbn'KTSDKTNATGSSKEDGKTTGSVTDDNFNRKIDSDQPDSRLNLTTN- 181

Query: 170 DGTGVINYATNITEDLSKETTSSTGVETNNDKTKQNTRSNAS EKETKNTD 219 

DG G + YA+ I E+ + ++TG TNN ++ + S S T N 

Sbjct: 182 DGOGTLEYASAIEENNTNNKRNTTG--TNNVTSSAESESTGSGTSDTVTTDNANTTTNDK 239

Query: 220 INKDQNQTKDTITRYKGKKCNTDYADLLEKYRRSVLRIEKMIFREMNKEGLFLLVY 275 

+N N +D I GK G YA L++ YR ++LRIEK IF EM + LF+LVY 
Sbjct: 240 LNSQINNVEDYIESKIGKSGTQSYASLVQDYRAALLRIEKRIFDEMQE--LFMLVY 293 



Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters) 

>gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN

>gi|75815|pir||ERBPNP terminal protein - phage NF
>gi|579177|emb|CAA68440| (Y00363) gene E product (AA
1-267) [Bacteriophage NF]
Length = 266

Score = 74.9 bits (181), Expect = 6e-13

Identities = 73/275 (26%), Positives = 129/275 (46%), Gaps = 37/275 (13%)

VRISKNDRAKLEKIYGKSNKARKKYNRLRQK-GVE - - - ERQLPTVPTSKKRL I DYVKSTN 5 8 
+RI+ ND+A K+ K+ KA K +R ++K G++ E +LP + + + 
IRITNNDKALYAKLV-KNTKA- -KISRTKKKYG1DLSNEIELPPLESFQ - - 52 



Query: 


3 


Sbjct: 


7 


Query: 


59 


Sbjct: 


53 


Query: 


119 


Sbjct: 


112 


Query: 


171 


Sbjct: 


166 


Query: 


228 


Sbjct: 


225 



+R +FNK 



N+NY F NK 



T++AQ+ +E +E 



-NKVEVKKPTENTIVTPTILTELGADLPFQAI PDFNIDAFTSPEGVQSYLEN 170 

K + I++P+ +T G P DFN D S +++ E 

GGKQQGTVGQRMQILSPSQVT--GISRP SD FNFDDVRS YARLRTLE EG 165 



Y+D R 



+ NF + + FNSD +D++V L + DF+Y+ F + 



++ +Y 



E + +KI 



G+V 



Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters) 

>gi|1429241|emb|CAA67660| (X99260) pre-neck appendage protein
[Bacteriophage B103]
Length = 860

Score = 50.8 bits (119), Expect = 6e-06

Identities = 29/105 (27%), Positives = 56/105 (52%), Gaps = 6/105 (5%)

Query: 8 KRFDGLPAVFKERFSKYPHTEYRYELLLDEEVSALIAYLNEVGALVNDMSGYLNYFIEHF 67

+RF+ L++++YT+ + L E+++ +I YLN++G L ND+ N +E
Sbjct: 7 RRFEKLGEMMVQVYERYLPTAFDESMTI^EKMNKIIEYLNQIGRLTNDVVEEVWKVMEWI 66

Query: 68 V-EKLEEITNDTLKKWLSDGTLENLINDTVFANYIKEIKRLQILV 111

+ + LE+ +TL+KW +G +L+ I E+K+ + V

Sbjct: 67 LNDGLEDYVKETLEKWYEEGKFADLV-----IQVIDELKQFGVSV 106



Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters) 

>gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14)
>gi|75860|pir||WMBP29 gene 14 protein - phage phi-29
>gi|15678|emb|CAA28631| (X04962) gene 14 product (AA
1-393) [Bacteriophage phi-29] >gi|225369|prf||1301270J
gene 14 [Bacteriophage phi-29]
Length = 131

Score = 96.7 bits (237), Expect = 6e-20

Identities = 53/131 (40%), Positives = 81/131 (61%), Gaps = 3/131 (2%)



Query: 1 MIEYITQWL-ADDNHLVYGLIIWLMVAMIIDFVLGFTIAKFNKEIDFSSFKAKAGIIVKV 59
MI ++ +L D+ L+Y L +LMV M++D VLG AK N I FSSFK K G+++KV
Sbjct: 3 MIAWMQHFI^TDETKLIYWLT-FI^/C>nA^TVI^VLFAKLNPNIKFSSFKIKTGVLI^ 61

Query: 60 AEMVLWYFIPVAVKFGAVGITMYITMLVGLILSEIYSILGHISDIDDDNNWTDYVKKFL 119
+EM+L + IP AV F A G+ + T+ L +SEIYSI GH+ +DD +++ + ++ F
Sbjct: 62 SEMILALLAIPFAVPFPA-GLPLLYTVYTALCVSEIYSIFGHLRLVDDKSDFLEILENFF 120

Query: 120 DGTLNRKDDIK 130
T + + K
Sbjct: 121 KRTSGKNKEEK 131





Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters) 

>gi|15670|emb|CAA24483| (V01155) reading frame 10 (may be gene 4)
[Bacteriophage phi-29]
Length = 124

Score = 69.9 bits (168), Expect = 6e-12

Identities = 39/119 (32%), Positives = 64/119 (53%), Gaps = 3/119 (2%)

Query: 3 IVKSTFDTQTPEGMLQVFNATNGASIPLRNAI-GEVLELKDILVYSDEVSGFGGAEPSQA 61

IVK+TFDT+T EG +++FNA G +N G ++E I Y +G A+ +

Sbjct: 6 IVKATFDTETLEGQIKIFNAQTGGGQSFKNLPDGTIIEANAIAQYKQVSDTYGDAK--EE 63

Query: 62 ELVAFFTEDGKTYAGVSAVATKSAKNLIDMMTANPDIKPKISFVEGKSNGGQKFVNLQV 120 

+ F DG Y+ +S ++A +LID++T + K+ V+G S+ G F +LQ+ 

Sbjct: 64 TVTTIFAADGSLYSAISKTVAEAASDLIDLVTRHKLETFKVKW^ 122 



Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters) 

>gi|1429235|emb|CAA67654| (X99260) head morphogenesis protein
[Bacteriophage B103]
Length = 101

Score = 60.9 bits (145), Expect = 1e-09

Identities = 34/96 (35%), Positives = 53/96 (54%), Gaps = 5/96 (5%)

Query: 1 MEIKEHESILNGILESVTDGEARSKIVEHLEALREDYGATTEALTSANSTLEKLKKDNEA 60

ME HE ILN + + + R++ + L+ LR DYG+ + S EKL+ +N

Sbjct: 3 MEROSHEEILNKLNDPELEHSERTEL---LQQLRADYGSVLSEFSELTSATEKLRAENSD 59

Query: 61 LVISNSKLFRERAIVEPAEN--NEPETDQNITLDDL 94

L++SNSKLFR+ I + E + E + IT++DL
Sbjct: 60 LIVSNSKLFRQVGITKEKEEEIKQEELSETITIEDL 95
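
The query headers in these listings appear to encode each ORF's position on the phage genome. Read that way (an inference from the headers themselves, not something the listing states), the field after "ORF|" is a start-end nucleotide range, the final field is the reading frame, and the "(N letters)" line gives the number of translated residues with the stop codon excluded. The following is a minimal Python sketch of that consistency check; the names OrfHeader, parse_query_header and is_consistent are hypothetical and introduced only for illustration.

```python
import re
from typing import NamedTuple


class OrfHeader(NamedTuple):
    sid: int       # numeric identifier after "sid|"
    orf_id: str    # e.g. "182ORF019"
    start: int     # first nucleotide of the ORF (assumed 1-based)
    end: int       # last nucleotide of the ORF (assumed inclusive)
    frame: int     # reading frame, 1 to 3
    letters: int   # residue count from the "(N letters)" line


# Matches a cleaned two-line header such as:
#   Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
#   (96 letters)
HEADER_RE = re.compile(
    r"Query=\s*sid\|(\d+)\|lan\|(\S+)\s.*?ORF\|(\d+)-(\d+)\|([123])\s*"
    r"\((\d+) letters\)",
    re.DOTALL,
)


def parse_query_header(text: str) -> OrfHeader:
    """Extract the numeric fields from one query header."""
    match = HEADER_RE.search(text)
    if match is None:
        raise ValueError("not a recognizable query header")
    sid, orf_id, start, end, frame, letters = match.groups()
    return OrfHeader(int(sid), orf_id, int(start), int(end), int(frame), int(letters))


def is_consistent(header: OrfHeader) -> bool:
    """Check the assumed relations between coordinates, frame and length."""
    span = header.end - header.start + 1
    expected_letters = span // 3 - 1            # assumption: stop codon not counted
    expected_frame = (header.start - 1) % 3 + 1  # assumption: frame from start position
    return (
        span % 3 == 0
        and expected_letters == header.letters
        and expected_frame == header.frame
    )


if __name__ == "__main__":
    example = "Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3\n(96 letters)"
    parsed = parse_query_header(example)
    print(parsed, is_consistent(parsed))
```

A header that fails this check is a reasonable candidate for a transcription error in the coordinates, the frame, or the letter count, although the check cannot say which field is wrong.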



Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters) 

>gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6

>gi|75841|pir||ERBP6Z gene 6 protein - phage PZA
>gi|216047 (M11813) gene 6 product [Bacteriophage PZA]
>gi|224746|prf||1112171K ORF 6 [Bacteriophage PZA]
Length = 96

Score =55.0 bits (130), Expect = 8e-08 
Identities = 28/79 (35%), Positives = 45/79 (56%)

Query: 4 KLMQRNVTSTKVE F S E V I VQDGA PT I VPCE P WLTG KLS EE KALSAI KRKN PDKNVWTN 63 

K+MQR +T T V +++++ DG + G LS E+A +KRK + V V + 

Sbjct: 3 KMMQREITKTTVNVAKMVMVIX3EVQVEQ 62 

Query: 64 VSHETALYTMPVDKFIELA 82 

V T +Y +PV+KF+E+A 
Sbjct: 63 VEPNTEVYELPVEKFLEVA 81

Table 26

Secondary structure prediction for ORF 182ORF008 

1 MMNGIDISSY QTGIDLSKVP CDFVNIKATG GTGYVNPDCD RAFQQALSLG KKIGVYHFAH 

CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE 
61 ERGLEGTPQQ EAQFFLDNIK GYIGKAVLIL DFEGSNQKDV NWAKAFLDYV YNKTGVKAWF 

CCCCCCCCHH HHHHHHHHHC CCCCEEEEEE CCCCCCCHHH HHHHHHHHHH HCCCCCEEEE 
121 YTYTANLNTT DFSSIAKGDY GLWVAEYGSN QPQGYSQPAP PKTNNFPIVA CFQFTSKGRL 

EEECCCCCCC CCCEECCCCC CEEEEECCCC CCCCCCCCCC CCCCCCCEEE EEEECCCCCC 
181 PGYNGNLDLN VFYGDGNTWD LYVGKKQDQI VPPENKIFDA TSDEFIFTLT TGSTSVFYFD 

CCCCCCCCEE EEECCCCCCE EEECCCCCCC CCCCCCCCCC CCCEEEEEEC CCCCEEEECC 
241 GETIFELSDP TQLDHIRGTY NHVHGKEIPS MVWTPEQFDI YLKMYEKKPV YK 

CCEEEECCCC CCHHHHCCEE CCCCCCEECC CCCCCCCHHH HHHHHCCCCE EC 
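
Each prediction above pairs a row of residues with a row of per-residue states. Assuming the common three-state convention (H for helix, E for extended strand, C for coil), which the table itself does not spell out, a state string can be summarized as per-state counts and as contiguous segments. The following is a minimal Python sketch using the first state row of the 182ORF008 prediction; the helper names are hypothetical.

```python
from collections import Counter
from itertools import groupby
from typing import List, Tuple


def clean(states: str) -> str:
    """Drop the spaces that separate the 10-residue blocks."""
    return "".join(states.split())


def composition(states: str) -> Counter:
    """Count how many residues are assigned to each state letter."""
    return Counter(clean(states))


def segments(states: str) -> List[Tuple[str, int, int]]:
    """Return (state, start, end) runs with 1-based inclusive positions."""
    runs = []
    position = 1
    for state, run in groupby(clean(states)):
        length = len(list(run))
        runs.append((state, position, position + length - 1))
        position += length
    return runs


# State row for residues 1-60 of 182ORF008, copied from the table above.
ROW_1_60 = "CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE"

if __name__ == "__main__":
    print(composition(ROW_1_60))
    for state, start, end in segments(ROW_1_60):
        print(state, start, end)
```

Concatenating all state rows of one prediction before calling segments gives runs numbered along the full protein rather than within a single row.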



Secondary structure prediction for ORF 182ORF014 



1 MIEYITQWLA DDNHLVYGLI IWLMVAMIID FVLGFTIAKF NKEIDFSSFK AKAGIIVKVA

CCCCEECCCC CCCCHHHHHH HHHHHHHHHH HHHHHHHHHC CCCCCHHHHH HHHCEEEEEE
61 EMVLWYFIP VAVKFGAVGI TMYITMLVGL ILSEIYSILG HISDIDDDNN WTDYVKKFLD

EEEEEEEECC CEEECCCEEE EEEEEEEEEE EEEEEEEECC CCCCCCCCCC CEEEEEEECC
121 GTLNRKDDIK

CCCCCCCEEC

Table 27

Enterococcus accession numbers 242/242



gi|2895751|gb|AF044978.1|AF044978 [2895751]
gi|4803755|dbj|AB026843.1|AB026843 [4803755]
gi|4769001|gb|AF140549.1|AF140549 [4769001]
gi|4760901|gb|AF099088.1|AF099088 [4760901]
gi|4704705|gb|AF121254.1|AF121254 [4704705]
gi|3342117|gb|AF076604.1|AF076604 [3342117]
gi|4688824|emb|AJ132470.1|ESP132470 [4688824]
gi|4732085|gb|AF125553.1|AF125553 [4732085]
gi|4732082|gb|AF125552.1|AF125552 [4732082]
gi|4732079|gb|AF125551.1|AF125551 [4732079]
gi|4732076|gb|AF125550.1|AF125550 [4732076]
gi|4732073|gb|AF125548.1|AF125548 [4732073]
gi|4732070|gb|AF125547.1|AF125547 [4732070]
gi|4732067|gb|AF125546.1|AF125546 [4732067]
gi|4732064|gb|AF125545.1|AF125545 [4732064]
gi|4732061|gb|AF125544.1|AF125544 [4732061]
gi|4704653|gb|AF114715.1|AF114715 [4704653]
gi|4704564|gb|AF102550.1|AF102550 [4704564]
gi|4688827|emb|AJ238249.1|EFA238249 [4688827]
gi|4680606|gb|AF125198.1|AF125198 [4680606]
gi|4633279|gb|AF117609.1|AF117609 [4633279]
gi|4633124|gb|AF110130.1|AF110130 [4633124]
gi|4590399|gb|AF124258.1|AF124258 [4590399]
gi|4590336|gb|AF108380.1|AF108380 [4590336]
gi|4590335|gb|AF108379.1|AF108379 [4590335]
gi|4019167|gb|U21300.1|CXU21300 [4019167]
gi|4545122|gb|AF077816.1|AF077816 [4545122]
gi|4433610|gb|AF106614.1|AF106614 [4433610]
gi|4468838|emb|AJ132039.1|EFA132039 [4468838]
gi|4468121|emb|AJ132958.1|BPH132958 [4468121]
gi|4456104|emb|Y17302.1|EHI17302 [4456104]
gi|4433611|gb|AF106615.1|AF106615 [4433611]
gi|4433607|gb|AF106611.1|AF106611 [4433607]



gi|4098267|gb|U76614.1|BLU76614 [4098267]
gi|47019|emb|Y00116.1|SFAMB1 [47019]
gi|4158179|emb|AL035206.1|SC9B5 [4158179]
gi|4165458|emb|X79343.1|EF16SSPA [4165458]
gi|4165457|emb|X79342.1|EFTRNALA [4165457]
gi|4165456|emb|X79341.1|EF23SRNA [4165456]
gi|4150978|emb|Y14027.1|EFY14027 [4150978]
gi|4127803|emb|AJ223161.1|EFAJ3161 [4127803]
gi|2956685|emb|Y16413.1|EFENTIJO [2956685]
gi|2665346|emb|Y13922.1|EHY13922 [2665346]
gi|4324675|gb|AF109375.1|AF109375 [4324675]
gi|4234627|gb|AF061013.1|AF061013 [4234627]
gi|4234626|gb|AF061012.1|AF061012 [4234626]
gi|4234625|gb|AF061011.1|AF061011 [4234625]
gi|4234624|gb|AF061010.1|AF061010 [4234624]
gi|4234623|gb|AF061009.1|AF061009 [4234623]
gi|4234622|gb|AF061008.1|AF061008 [4234622]
gi|4234621|gb|AF061007.1|AF061007 [4234621]
gi|4234620|gb|AF061006.1|AF061006 [4234620]
gi|4234619|gb|AF061005.1|AF061005 [4234619]
gi|4234618|gb|AF061004.1|AF061004 [4234618]
gi|4234617|gb|AF061003.1|AF061003 [4234617]
gi|4234616|gb|AF061002.1|AF061002 [4234616]
gi|4234615|gb|AF061001.1|AF061001 [4234615]
gi|4234614|gb|AF061000.1|AF061000 [4234614]
gi|3138990|gb|AF060241.1|AF060241 [3138990]
gi|3138986|gb|AF060240.1|AF060240 [3138986]
gi|4204535|gb|AF094803.1|AF094803 [4204535]
gi|4204534|gb|AF094802.1|AF094802 [4204534]
gi|4204533|gb|AF094801.1|AF094801 [4204533]
gi|4204532|gb|AF094800.1|AF094800 [4204532]
gi|4204531|gb|AF094799.1|AF094799 [4204531]
gi|4204530|gb|AF094798.1|AF094798 [4204530]
gi|4204529|gb|AF094797.1|AF094797 [4204529]
gi|4204528|gb|AF094796.1|AF094796 [4204528]
gi|4204527|gb|AF094795.1|AF094795 [4204527]
gi|4204526|gb|AF094794.1|AF094794 [4204526]
gi|4204525|gb|AF094793.1|AF094793 [4204525]
gi|4204524|gb|AF094792.1|AF094792 [4204524]
gi|4204523|gb|AF094791.1|AF094791 [4204523]
gi|4204522|gb|AF094790.1|AF094790 [4204522]
gi|4204521|gb|AF094789.1|AF094789 [4204521]
gi|4204520|gb|AF094788.1|AF094788 [4204520]
gi|4204519|gb|AF094787.1|AF094787 [4204519]
gi|4204518|gb|AF094786.1|AF094786 [4204518]
gi|4204517|gb|AF094785.1|AF094785 [4204517]
gi|4204516|gb|AF094784.1|AF094784 [4204516]
gi|4204515|gb|AF094783.1|AF094783 [4204515]
gi|4204514|gb|AF094782.1|AF094782 [4204514]
gi|4204513|gb|AF094781.1|AF094781 [4204513]
gi|4204512|gb|AF094780.1|AF094780 [4204512]
gi|3873186|gb|AF034779.1|AF034779 [3873186]
gi|4151367|gb|AF093508.1|AF093508 [4151367]
gi|2828136|gb|AF039903.1|AF039903 [2828136]
gi|2828135|gb|AF039902.1|AF039902 [2828135]
gi|2828134|gb|AF039901.1|AF039901 [2828134]
gi|2828133|gb|AF039900.1|AF039900 [2828133]
gi|2828132|gb|AF039899.1|AF039899 [2828132]
gi|2828131|gb|AF039898.1|AF039898 [2828131]
gi|4103866|gb|AF028812.1|AF028812 [4103866]
gi|4103864|gb|AF028811.1|AF028811 [4103864]
gi|2605925|gb|AF029727.1|AF029727 [2605925]
gi|1402750|gb|U60038.1|EFU60038 [1402750]
gi|1835780|gb|U86375.1|EFU86375 [1835780]
gi|3831555|gb|AF047608.1|AF047608 [3831555]
gi|3790617|gb|AF097414.1|AF097414 [3790617]
gi|3767587|dbj|AB005036.1|AB005036 [3767587]
gi|3757810|gb|AF042288.1|AF042288 [3757810]
gi|3747039|gb|AF093509.1|AF093509 [3747039]
gi|3660559|dbj|AB017811.1|AB017811 [3660559]
gi|1147743|gb|U42211.1|EHU42211 [1147743]
gi|3676412|gb|AF051917.1|AF051917 [3676412]
gi|3676164|emb|AJ011113.1|EFA011113 [3676164]
gi|2612869|gb|AF005726.1|AF005726 [2612869]
gi|2353762|gb|AF016233.1|AF016233 [2353762]



gi|2149899|gb|U94707.1|EFU94707 [2149899]
gi|2149149|gb|U82366.1|LSU82366 [2149149]
gi|1469463|gb|U49512.1|EFU49512 [1469463]
gi|1244503|gb|U35366.1|EFU35366 [1244503]
gi|833854|gb|U26268.1|EFU26268 [833854]
gi|841200|gb|U18931.1|CPU18931 [841200]
gi|460079|gb|U00457.1|U00457 [460079]
gi|460077|gb|U00456.1|U00456 [460077]
gi|535661|gb|L34675.1|INSTRANSPO [535661]
gi|3023041|gb|AF007787.1|AF007787 [3023041]
gi|431124|gb|L15633.1|TRN916ENT [431124]
gi|388106|gb|L23802.1|ENEEBSA [388106]
gi|3608387|gb|AF071085.1|AF071085 [3608387]
gi|3551851|gb|AF076027.1|AF076027 [3551851]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3551743|gb|U57498.1|ECU57498 [3551743]
gi|3243178|gb|AF063010.1|AF063010 [3243178]
gi|3136316|gb|AF063900.1|AF063900 [3136316]
gi|3540256|gb|AF052459.1|AF052459 [3540256]
gi|755215|gb|U17696.1|LLU17696 [755215]
gi|3421437|gb|AF082295.1|AF082295 [3421437]
gi|3421436|gb|AF082294.1|AF082294 [3421436]
gi|3421435|gb|AF082293.1|AF082293 [3421435]
gi|3421434|gb|AF082292.1|AF082292 [3421434]
gi|3341430|emb|Y17797.1|EFY17797 [3341430]
gi|3319647|emb|X69092.1|EHPBP3RA [3319647]
gi|3292886|emb|AJ007584.1|EFA7584 [3292886]
gi|3261536|emb|AL021958.1|MTV041 [3261536]
gi|3250708|emb|Z95150.1|MTCY164 [3250708]
gi|3249688|gb|AF070678.1|AF070678 [3249688]
gi|3249687|gb|AF070677.1|AF070677 [3249687]
gi|3249686|gb|AF070676.1|AF070676 [3249686]
gi|3219158|dbj|AB015233.1|AB015233 [3219158]
gi|2765275|emb|Y12924.1|SPY12924 [2765275]
gi|3183687|emb|Y11621.1|EA16SRRN [3183687]
gi|2765274|emb|Y12923.1|EFY12923 [2765274]
gi|2765273|emb|Y12922.1|ESY12922 [2765273]
gi|2765272|emb|Y12921.1|ESY12921 [2765272]
gi|2765271|emb|Y12920.1|EDY12920 [2765271]
gi|2765270|emb|Y12919.1|ESY12919 [2765270]
gi|2765269|emb|Y12918.1|ECY12918 [2765269]
gi|2765268|emb|Y12917.1|ECY12917 [2765268]
gi|2765267|emb|Y12916.1|EPY12916 [2765267]
gi|2765266|emb|Y12915.1|ESY12915 [2765266]
gi|2765265|emb|Y12914.1|ERY12914 [2765265]
gi|2765264|emb|Y12913.1|EMY12913 [2765264]
gi|2765263|emb|Y12912.1|EHY12912 [2765263]
gi|2765262|emb|Y12911.1|EMY12911 [2765262]
gi|2765261|emb|Y12910.1|EGY12910 [2765261]
gi|2765260|emb|Y12909.1|EDY12909 [2765260]
gi|2765259|emb|Y12908.1|ECY12908 [2765259]
gi|2765258|emb|Y12907.1|EAY12907 [2765258]
gi|2765257|emb|Y12906.1|EFY12906 [2765257]
gi|2765256|emb|Y12905.1|EFY12905 [2765256]
gi|2894541|emb|AJ223332.1|EFAJ3332 [2894541]
gi|2894539|emb|AJ223331.1|EFAJ3331 [2894539]
gi|3108058|gb|AF060881.1|AF060881 [3108058]
gi|3087776|emb|AJ223633.1|EFAJ3633 [3087776]
gi|3080754|gb|AF016483.1|AF016483 [3080754]
gi|2197119|gb|AF003921.1|AF003921 [2197119]
gi|2982722|dbj|AB012213.1|AB012213 [2982722]
gi|2982721|dbj|AB012212.1|AB012212 [2982721]
gi|2058780|gb|B07890.1|B07890 [2058780]
gi|2058779|gb|B07889.1|B07889 [2058779]
gi|2058778|gb|B07888.1|B07888 [2058778]
gi|2058777|gb|B07887.1|B07887 [2058777]
gi|2058776|gb|B07886.1|B07886 [2058776]
gi|2058775|gb|B07885.1|B07885 [2058775]
gi|2058774|gb|B07884.1|B07884 [2058774]
gi|2058773|gb|B07873.1|B07873 [2058773]
gi|2058772|gb|B07872.1|B07872 [2058772]
gi|2058771|gb|B07871.1|B07871 [2058771]
gi|2058770|gb|B07870.1|B07870 [2058770]
gi|2058769|gb|B07869.1|B07869 [2058769]
gi|2058768|gb|B07868.1|B07868 [2058768]
gi|2058767|gb|B07867.1|B07867 [2058767]
gi|2058766|gb|B07866.1|B07866 [2058766]
gi|2058765|gb|B07865.1|B07865 [2058765]
gi|2058764|gb|B07864.1|B07864 [2058764]
gi|2058763|gb|B07883.1|B07883 [2058763]



gi|2058762|gb|B07882.1|B07882 [2058762]
gi|2058761|gb|B07881.1|B07881 [2058761]
gi|2058760|gb|B07880.1|B07880 [2058760]
gi|2058759|gb|B07879.1|B07879 [2058759]
gi|2058758|gb|B07878.1|B07878 [2058758]
gi|2058757|gb|B07877.1|B07877 [2058757]
gi|2058756|gb|B07876.1|B07876 [2058756]
gi|2058755|gb|B07875.1|B07875 [2058755]
gi|2058754|gb|B07874.1|B07874 [2058754]
gi|2058753|gb|B07863.1|B07863 [2058753]
gi|2058752|gb|B07862.1|B07862 [2058752]
gi|2058751|gb|B07861.1|B07861 [2058751]
gi|2058750|gb|B07860.1|B07860 [2058750]
gi|2058749|gb|B07859.1|B07859 [2058749]
gi|2058748|gb|B07858.1|B07858 [2058748]
gi|2058747|gb|B07857.1|B07857 [2058747]
gi|2058746|gb|B07856.1|B07856 [2058746]
gi|2058745|gb|B07855.1|B07855 [2058745]
gi|2058744|gb|B07854.1|B07854 [2058744]
gi|2058743|gb|B07853.1|B07853 [2058743]
gi|2058742|gb|B07852.1|B07852 [2058742]
gi|2058741|gb|B07851.1|B07851 [2058741]
gi|2058740|gb|B07850.1|B07850 [2058740]
gi|2947527|gb|T25933.1|T25933 [2947527]
gi|2924302|emb|X81655.1|EHERMAM [2924302]
gi|2664256|emb|Y12234.1|EFAS48C [2664256]
gi|2879906|dbj|D85752.1|D85752 [2879906]
gi|2746216|gb|AF028836.1|AF028836 [2746216]
gi|2745825|gb|AF039139.1|AF039139 [2745825]
gi|2696019|dbj|AB007844.1|AB007844 [2696019]
gi|48999|emb|X62280.1|EHPBP5G [48999]
gi|2654477|gb|U89914.1|BFU89914 [2654477]
gi|43347|emb|X68646.1|EHPSRAA [43347]
gi|2613034|gb|AH005624.1|SEG_EDDH4RR [2613034]
gi|2613033|gb|AF029775.1|EDDH4RR2 [2613033]
gi|2613032|gb|AF029774.1|EDDH4RR1 [2613032]
gi|2613031|gb|AH005623.1|SEG_EDDHIRR [2613031]
gi|2613030|gb|AF029773.1|EDDHIRR2 [2613030]
gi|2613029|gb|AF029772.1|EDDHIRR1 [2613029]
gi|2613028|gb|AH005622.1|SEG_EDH19RR [2613028]
gi|2613027|gb|AF029771.1|EDH19RR2 [2613027]
gi|2613026|gb|AF029770.1|EDH19RR1 [2613026]
gi|2613025|gb|AH005621.1|SEG_EDISRR [2613025]
gi|2613024|gb|AF029769.1|EDISRR2 [2613024]
gi|2613023|gb|AF029768.1|EDISRR1 [2613023]
gi|1881226|dbj|AB001488.1|AB001488 [1881226]
gi|2547160|gb|AF023104.1|AF023104 [2547160]
gi|2547159|gb|AF023103.1|AF023103 [2547159]
gi|2547158|gb|AF023102.1|AF023102 [2547158]
gi|2547157|gb|AF023101.1|AF023101 [2547157]
gi|2415383|gb|AF015775.1|AF015775 [2415383]
gi|2388636|gb|U94356.1|EFU94356 [2388636]
gi|2388634|gb|U94355.1|ECU94355 [2388634]
gi|2340825|dbj|D26045.1|D26045 [2340825]
gi|2226147|emb|Y14080.1|BSY14080 [2226147]
gi|2327026|gb|U87997.1|EFU87997 [2327026]
gi|2318058|gb|AF012532.1|AF012532 [2318058]
gi|1848175|emb|X87189.1|EM23S5SSP [1848175]
gi|1848174|emb|X87187.1|EM16S23SS [1848174]
gi|1848173|emb|X87188.1|EM16S23SP [1848173]
gi|1848172|emb|X87185.1|EH23S5SSP [1848172]
gi|1848171|emb|X87184.1|EH16S23SS [1848171]
gi|1848170|emb|X87181.1|EF23S5SSP [1848170]
gi|1848169|emb|X87183.1|EF23S5SPA [1848169]
gi|1848168|emb|X87191.1|EF23S5SAC [1848168]
gi|1848167|emb|X87180.1|EF16S23SS [1848167]
gi|1848166|emb|X87182.1|EF16S23SP [1848166]
gi|1848165|emb|X87190.1|EF16S23SC [1848165]
gi|1848164|emb|X87186.1|EF16S23SA [1848164]
gi|1848156|emb|X87179.1|ED23S5SSP [1848156]
gi|1848155|emb|X87178.1|ED16S23SS [1848155]
gi|1848154|emb|X87177.1|ED16S23SA [1848154]
gi|2274942|emb|AJ000346.1|EHNAPBC [2274942]
gi|2274939|emb|AJ000042.1|EFGLS24B [2274939]
gi|414575|gb|L12710.1|ENEAAC [414575]
gi|2245603|gb|AF006008.1|AF006008 [2245603]



gi|2231992|gb|U94530.1|EFU94530 [2231992]
gi|2231990|gb|U94529.1|EFU94529 [2231990]
gi|2231988|gb|U94528.1|EFU94528 [2231988]
gi|2231986|gb|U94527.1|EFU94527 [2231986]
gi|2231984|gb|U94526.1|EFU94526 [2231984]
gi|2231982|gb|U94525.1|ECU94525 [2231982]
gi|2231980|gb|U94524.1|ECU94524 [2231980]
gi|2231978|gb|U94523.1|ECU94523 [2231978]
gi|2231976|gb|U94522.1|ECU94522 [2231976]
gi|2231974|gb|U94521.1|ECU94521 [2231974]
gi|2196685|gb|U25090.1|EFU25090 [2196685]
gi|2197120|gb|AF003922.1|AF003922 [2197120]
gi|2196683|gb|U25095.1|EFU25095 [2196683]
gi|2196681|gb|U25094.1|EFU25094 [2196681]
gi|2196679|gb|U25093.1|EFU25093 [2196679]
gi|2196677|gb|U25092.1|EFU25092 [2196677]
gi|2196675|gb|U25091.1|EFU25091 [2196675]
gi|2196673|gb|U24682.1|EFU24682 [2196673]
gi|532533|gb|U09422.1|EFU09422 [532533]
gi|487271|dbj|D17462.1|ENENTP [487271]
gi|468459|dbj|D28859.1|ENEPPD1 [468459]
gi|440135|dbj|D16334.1|ENEATPK [440135]
gi|391680|dbj|D13816.1|ENENAABS [391680]
gi|1402524|dbj|D78257.1|D78257 [1402524]
gi|709995|dbj|D30808.1|BACYCB20 [709995]
gi|2109265|gb|U91527.1|EFU91527 [2109265]
gi|1041112|dbj|D78016.1|ENEPPD1A [1041112]
gi|1339880|dbj|D85392.1|ENERPA [1339880]
gi|1339878|dbj|D85393.1|ENEGE1E [1339878]
gi|662918|emb|Z46807.1|EHCOPAYZ [662918]
gi|769796|emb|X86176.1|EFRPODDNE [769796]
gi|1854638|gb|U51479.1|EGU51479 [1854638]
gi|1857221|gb|U72706.1|EFU72706 [1857221]
gi|1857219|gb|U72704.1|EFU72704 [1857219]
gi|1857217|gb|U72705.1|ECU72705 [1857217]
gi|1272655|emb|X96978.1|EFPPD1GNS [1272655]
gi|1272652|emb|X96976.1|EFPLSEPIG [1272652]
gi|1279406|emb|X96977.1|EFPAD1ORF [1279406]
gi|1070149|emb|X93211.1|EFTNFO1 [1070149]
gi|1065723|emb|X92947.1|EFTETMGN [1065723]
gi|1019639|gb|L38972.1|PH4COINJN [1019639]
gi|1151151|gb|U43087.1|EFU43087 [1151151]
gi|1098507|gb|U17283.1|BMU17283 [1098507]
gi|1498072|gb|U64887.1|EFU64887 [1498072]
gi|1498071|gb|U64886.1|EFU64886 [1498071]
gi|1469783|gb|U58049.1|EHU58049 [1469783]
gi|1763666|gb|U81452.1|EFU81452 [1763666]
gi|624694|gb|L38973.1|PH4SEQ [624694]
gi|1730458|emb|Z83305.1|EFVANRES [1730458]
gi|1419498|emb|X84796.1|ECPFW4 [1419498]
gi|1419497|emb|X84795.1|ECPFW3 [1419497]
gi|1419496|emb|X84794.1|ECPFW1 [1419496]
gi|254400|gb|S43266.1|S43266 [254400]
gi|239025|gb|S66277.1|S66277 [239025]
gi|1054931|gb|U38590.1|EFU38590 [1054931]
gi|1244573|gb|U39788.1|EHU39788 [1244573]
gi|1244571|gb|U39789.1|EGU39789 [1244571]
gi|1244569|gb|U39790.1|EFU39790 [1244569]
gi|1255020|gb|U39777.1|ESU39777 [1255020]
gi|1255018|gb|U39775.1|EPU39775 [1255018]
gi|1255016|gb|U39778.1|EDU39778 [1255016]
gi|1255014|gb|U39776.1|ECU39776 [1255014]
gi|1255012|gb|U39774.1|EAU39774 [1255012]
gi|1619922|gb|U69267.1|IVU69267 [1619922]
gi|790436|emb|X84861.1|EFEFMPBP5 [790436]
gi|790434|emb|X84858.1|EFD63RPSR [790434]
gi|790432|emb|X84862.1|EF721PBP5 [790432]
gi|790430|emb|X84860.1|EF63RPBP5 [790430]
gi|790428|emb|X84859.1|EF366PBP5 [790428]
gi|1572800|gb|U70854.1|CELF38A5 [1572800]
gi|1041816|gb|U17153.1|EFU17153 [1041816]
gi|1086523|gb|U39859.1|EFU39859 [1086523]
gi|403564|gb|U01917.1|EFU01917 [403564]
gi|1515474|gb|U66286.1|EFU66286 [1515474]
gi|1513068|gb|U15554.1|LMU15554 [1513068]
gi|1296520|emb|X94181.1|EFENTAORF [1296520]
gi|1488069|gb|U63997.1|EFU63997 [1488069]
gi|1209525|gb|U35369.1|EFU35369 [1209525]



gi|1469341|gb|U30931.1|ESU30931 [1469341]
gi|488331|gb|M77276.1|SYNGIP2122 [488331]
gi|1046177|gb|U39733.1| [1046177]
gi|1236613|gb|U49939.1|CVU49939 [1236613]
gi|47491|emb|X55766.1|SS16SR5G [47491]
gi|47490|emb|X55767.1|SS16SR3G [47490]
gi|47061|emb|X56353.1|SFTET916 [47061]
gi|49022|emb|X62755.1|SFNPRG [49022]
gi|47047|emb|X17214.1|SFPASA1 [47047]
gi|47044|emb|X68847.1|SFNOXAA [47044]
gi|47033|emb|V01547.1|SFKANR [47033]
gi|47018|emb|X02027.1|SF5SRNA [47018]
gi|511044|emb|X75752.1|MP16SRNA0 [511044]
gi|511043|emb|X75751.1|MP16SR243 [511043]
gi|886481|emb|X82819.1|ESPLPAM [886481]
gi|517387|emb|X76177.1|ES16SRR [517387]
gi|472916|emb|X76913.1|EHNTPOP [472916]
gi|43351|emb|X55133.1|ES16SRRN [43351]
gi|1143442|emb|X92687.1|EFPBP5G [1143442]
gi|963032|emb|Z50854.1|EHARPQTOU [963032]
gi|886479|emb|X84818.1|EHDNAPSR [886479]
gi|551437|emb|X81654.1|EHIS1216 [551437]
gi|467805|emb|X78425.1|EFPBP5 [467805]
gi|296721|emb|X55961.1|EFPD78 [296721]
gi|287946|emb|Z19137.1|EFPTSHGN [287946]
gi|49042|emb|X63285.1|EHNAKA [49042]
gi|49019|emb|X62658.1|EFSEA1 [49019]
gi|43337|emb|Z12296.1|EFSPREG [43337]
gi|43335|emb|X56895.1|EFPVANAG [43335]
gi|43333|emb|X16421.1|EFPF54 [43333]
gi|43331|emb|X62657.1|EFORF3 [43331]
gi|1065721|emb|X92945.1|EFCAT501 [1065721]
gi|806551|emb|Z49243.1|EF4110SOD [806551]
gi|806549|emb|Z49244.1|EF4105SOD [806549]
gi|505530|emb|X79542.1|EFAS48 [505530]
gi|43323|emb|X62656.1|EFASPh [43323]
gi|40840|emb|X56422.1|EC16SRNAG [40840]
gi|48189|emb|X04388.1|TN1545TR [48189]
gi|928814|gb|L40841.1|ENETRANSPO [928814]
gi|141856|gb|L01794.1|AD1REPABC [141856]
gi|149125|gb|M90647.1|IP8VANY [149125]
gi|141862|gb|M87836.1|AD1TRAE1 [141862]
gi|141860|gb|M84374.1|AD1TRAA [141860]
gi|141853|gb|M62888.1|AD1PAD1 [141853]
gi|1101637|dbj|D31674.1|EVM16RNA7 [1101637]
gi|1101636|dbj|D31675.1|ENE16RNA8 [1101636]
gi|497792|dbj|D31676.1|ENC16RNA9 [497792]
gi|1022729|gb|U36195.1|EFU36195 [1022729]
gi|488338|gb|M77279.1|SYNGIP3124 [488338]
gi|488335|gb|M77278.1|SYNGEP2563 [488335]
gi|488333|gb|M77277.1|SYNGIP2124 [488333]
gi|488329|gb|M77275.1|SYNGIP2121 [488329]
gi|388267|gb|L19532.1|AD1TRAC [388267]
gi|493016|gb|U03756.1|EFU03756 [493016]
gi|453536|gb|L28754.1|INSTRAN [453536]
gi|153658|gb|M58002.1|STRHYDROLA [153658]
gi|475427|gb|U00681.1|EFU00681 [475427]
gi|818704|gb|U24692.1|EFU24692 [818704]
gi|155036|gb|M97297.1|TRNVAN [155036]
gi|150552|gb|M64978.1|PCFPRGAB [150552]
gi|786274|gb|U22541.1|EHU22541 [786274]
gi|786273|gb|U22540.1|EHU22540 [786273]
gi|559858|gb|L37110.1|AD1CLYL [559858]
gi|643614|gb|U16659.1|ECU16659 [643614]
gi|643612|gb|U16658.1|ECU16658 [643612]
gi|290641|gb|L13292.1|ENECOPPUMP [290641]
gi|624701|gb|L29639.1|ENEVANCRF [624701]
gi|624699|gb|L29638.1|ENEVANCR [624699]
gi|624692|gb|L29641.1|ENEDDLA [624692]
gi|624690|gb|L29640.1|ENEDDL [624690]
gi|493094|gb|L32813.1|ENERRD [493094]



gi|153852|gb|AH000939.1|SEG_STRTN916 [153852]
gi|153851|gb|M22645.1|STRTN9162 [153851]
gi|153850|gb|M20864.1|STRTN9161 [153850]
gi|153660|gb|M36878.1|STRIF2BA [153660]
gi|153585|gb|M13771.1|STRBRP [153585]
gi|153575|gb|M64265.1|STRATPEFHA [153575]
gi|153565|gb|M90060.1|STRATPASEA [153565]
gi|152969|gb|M92376.1|STABLAIA [152969]
gi|309660|gb|L14285.1|PCFPRGW2Y [309660]
gi|433714|gb|L12033.1|ENESATA [433714]
gi|290645|gb|L15304.1|ENEVANB2A [290645]
gi|148331|gb|M84146.1|ENEVAKR [148331]
gi|148329|gb|M64304.1|ENEVANH [148329]
gi|148326|gb|M68910.1|ENEVANCRES [148326]
gi|148324|gb|M75132.1|ENEVANC [148324]
gi|148323|gb|L06138.1|ENEVANB [148323]
gi|148321|gb|M85225.1|ENETETM [148321]
gi|148320|gb|L00925.1|ENERTRNA [148320]
gi|148319|gb|L00924.1|ENERRNA [148319]
gi|148317|gb|M81466.1|ENERECA [148317]
gi|148315|gb|M81961.1|ENENAPA [148315]
gi|148312|gb|M38386.1|ENEMSPDPS [148312]
gi|148310|gb|M37185.1|ENEGELE [148310]
gi|148307|gb|L07892.1|ENEBLACREG [148307]
gi|148305|gb|M60253.1|ENEBELAA [148305]
gi|148303|gb|M77639.1|ENEB14NAM [148303]
gi|290644|gb|L16515.1|ENERGTG [290644]
gi|154954|gb|M37184.1|TRN916 [154954]
gi|148301|gb|M69221.1|ENEAAD9A [148301]
gi|148308|gb|M38052.1|ENECYLB [148308]
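
Most entries in Table 27 follow the pattern gi|GI|database|ACCESSION.VERSION|LOCUS [GI], with the GI number repeated in the trailing brackets. That redundancy allows a mechanical check of each entry. The following is a minimal Python sketch, assuming the conventional database codes gb (GenBank), emb (EMBL) and dbj (DDBJ); the names Entry and parse_entry are hypothetical, and stray whitespace of the kind found in scanned copies is removed before matching.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    gi: int          # GI number
    db: str          # source database code: gb, emb or dbj
    accession: str   # accession without the version suffix
    version: int     # accession version
    locus: str       # locus name (may be empty)


ENTRY_RE = re.compile(
    r"^gi\|(\d+)\|(gb|emb|dbj)\|([A-Z]+\d+)\.(\d+)\|(\S*)\[(\d+)\]$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one table entry; return None if it is malformed or inconsistent."""
    compact = "".join(line.split())  # remove all whitespace, including stray spaces
    match = ENTRY_RE.match(compact)
    if match is None:
        return None
    gi, db, accession, version, locus, bracket_gi = match.groups()
    if gi != bracket_gi:
        # The two copies of the GI number disagree, so the entry needs review.
        return None
    return Entry(int(gi), db, accession, int(version), locus)


if __name__ == "__main__":
    print(parse_entry("gi|4803755|dbj|AB026843.1|AB026843 [4803755]"))
```

Entries that return None are the ones worth checking by hand against the original filing, since the sketch cannot tell which of two conflicting fields is the correct one.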
Table 28

Phage Dpi complete genome sequence. 56506 nucleotides. 

1 ataataaaaa tatgaagcag atattgggtt aattattgct taacaaaatg caccgaattt gtgtataata 

71 taagtgaagc agttttgtaa acctgacatc ctgctaaata aaaataaagg aggctcgaac atgagtcaaa 

141 acactacacg cactgacgcc gaactgacag gcgttactct tttaggaaac caagacacca aatacgatta 

211 tgactataat ccagacgtcc ttgaaacttt ccctaacaaa catcctgaaa ataattacct agtaacattt 

281 gacggatatg aattcacttc cctttgccct aaaacaggac agcctgactt cgcgaatgtt ttcattagtt 

351 acattccaaa cgaaaagatg gttgaatcta aatcattgaa attgtactta ttcagtttcc gtaaccacgg 

421 tgacttccac gaagattgca tgaacactat tttgaatgac ttgtatgaat tgatggaacc taagtacatt 

491 gaagtcatgg gcctattcac tcctcgtggt ggaatttcaa tttacccatt cgtcaacaaa gtgaatcctc

561 aatttgcaac tcctgaactt gaacagcttc aacttcaacg caaattgaac ttccttggaa atgttcaagg 

631 tcttggacga gctattcgat aggaggctgg aatgaaatca gtagttttat tatccggcgg agtcgactca 

701 gccacttgtt tagcaattga agttgacaag tggggttcta aaaatgttca tgctatagca ttcaattacg 

771 gacaaaagca tgaagcagaa cttgaaaatg ctgctaatgt tgcaatgttc tacggagtca agttcaccat 

841 tcttgaaatt gactcgaaaa tctactcaag ctctagctct tccttattac aaggaaaagg cgaaatttca

911 catggaaaat cttacgctga aatcctagca gagaaggaag tagttgacac ctatgttcca tttagaaatg 

981 gactaatgct ttcacaggct gcggcttatg cttattcggt tggagcttct tacgtcgtat atggtgctca 

1051 cgcagacgat gcggctggag gtgcttaccc tgattgcact cctgagttct ataattcaat gtcaaatgca 

1121 atggaatatg gaactggagg caaggtaacc cttgtcgctc ctctacttac tctaaccaag gcgcaagtcg 

1191 ttaaatgggg aattgattta gatgttcctt atttcttgac tcgttcatgt tatgaaagtg acgctgaaag 

1261 ttgtggaact tgcgcaactt gtatcgaccg caaaaaggca ttcgaagaaa atggaatgac tgaccctatt 

1331 cattataagg agaattgata tgagagtttc taaaacctta acattcgacg cagctcacca actagttgga

1401 cattttggaa aatgcgcaaa tttgcacggg catacttaca aagtcgaaat ttcattagca ggcggaactt

1471 atgaccacgg ttcgagtcaa gggatggttg ttgactttta tcacgtcaag aaaatcgcag gtacattcat

1541 tgacagactt gaccacgctg ttcttcttca agggaatgaa ccaatcgctt tagcaaatgc agttgacacc

1611 aagcgagttc tatttggatt tagaactacg gctgagaata tgtcaagatt ccttacctgg actctcacgg

1681 agcttatgtg gaagcatgct cgtatcgact ctatcaaact atgggaaact cctacaggtt gcgcagaatg

1751 tacttactac gagattttca cagaagacga gattgaaatg ctcaagaacg taacctttat cgacaaagac

1821 gaaaagatta ctgtccgcga aattttagag caggagcagg ataatggtta atcaatacaa tcagcctgaa

1891 agaggcaaga ttcgaatcaa tgttcgcgac cctgagaaaa tgcctatcat ggaaattttc ggtcctacaa

1961 ttcaaggtga aggaatggtt ataggtcaaa agactatttt cattcgaact ggtggatgcg actatcattg

2031 caactggtgt gactcagcct ttacctggaa cggtactact gagccggaat atatcacagg caaagaagct

2101 gctagtcgaa tcttgaaact agctttcaat gataaaggtg aacagatttg taaccacgtg acattgactg

2171 gaggaaatcc tgccttaatc aacgagccta tggctaagat gatttcgatt ctaaaagaac atggattcaa

2241 gtttggtctc gaaactcaag gaactcgatt ccaagaatgg ttcaaagaag taagcgatat cactattagt

2311 cctaaaccgc cttcaagtgg aatgagaact aatatgaaaa ttcttgaagc tattgtagat agaatgaatg 

2381 acgaaaacct tgactggtca tttaaaatcg ctatctttga cgaaaatgac ctagcttatg cgcgtgatat 

2451 gtttaaaact ttcgaaggca agttacgtcc agtgaactac ctttcagttg ggaatgcaaa cgcatacgaa 

2521 gaaggaaaaa tcagtgatag gcttcttgaa aagttgggat ggctttggga taaagtgtat gaagacccag 

2591 ctttcaacaa tgttcgacct ttaccgcaac ttcatacact tgtttatgat aataaaagag gagtataaaa 

2661 tgaaaattga gcatctagat aaaatcggta acgtattagg gagagagaac ggatgggctt cccttaagcc

2731 ggatgaaatt gtaaccttgg acaatactga ggcagccgtt caaagacttt ttggtctatt aggcgaggac

2801 gcagaacgtg acgggttgca agatactcca ttccgttttg ttaaagcact cgctgaacat accgtagggt

2871 accgagaaga ccctaaactt catctcgaaa aaacatccga cgtcgaccat gaagaccttg ttcttgtgaa 

2941 agacattcca ttcaattctt tatgtgagca tcatttagct ccgttcgtag ggaaggtgca tattgcatac

3011 actcccaagg ataagattac aggtctttca aaattcggtc gagtggttga aggatacgct aaacgacttc

3081 aagtacaaga gcgcttgact caacaaatcg ctgacgctat tcaggaagtt ctaaatcctc aagcagttgc

3151 ggtcatcgta gaggctgagc atacttgcat gagcggacgc ggtattaaga agcacggggc aacgacagtg

3221 acttcaacta tgcgaggtct tttccaagat gacgcatctg ctcgagcaga attgcttcag ttgattaaaa

3291 agtaggaggc ggaaaatgaa taaaagtgca accttttggc ctgttcgaac agctcttatc gcggctctat

3361 atgtgacatt gaccgttgca ttttctgcta ttagttatgg acctattcaa tttagagcca gtgaagcctt

3431 gattcttcta cctttatgga accatagatg gactccgggg attgtattag gaacaatcat tgcaaacttc

3501 ttttcacctc ttggactgat tgacgtttta tccggttcac ttgctacctt ccttggagta gtggcaatgg

3571 tgaaagttgc taagatggca agtcctctat attcacttat ctgtccagtt cttgctaatg cttaccttat

3641 tgcgctggaa cttcgaatag tttactcttt acctttttgg gaatctgtca tctatgtagg aattagtgaa 

3711 gcgattatcg ttttaatttc atacttcctt atttccacgc tggcgaagaa caatcatttt agaacactga

3781 taggagcgaa aaatgggatt taatctatac ttcgcaggag gtcacgctat tagcactgac gattatttga

3851 aggaaagagg agccaatcgc ctattcaatc aactgtacga aagaaacggg attggcaaaa ggtggattga 

3921 gcacaagaaa accaacccaa gcactacttc aaaactattc gtcgactcta gcgcatattc tgctcatacc

3991 aaaggggctg aagttgacat tgacgcctat atcgaatacg tgaatgataa cgtgggaatg tttgactgta

4061 tcgccgaact cgataaaatt cctggtgtat ttagacagcc taagacacgt gaacagcttt tggaagcacc

4131 acaaatttct tgggataatt atctatacat gcgcgagcga atggttgaga aagacaagct cttacctatt

4201 ttccatatgg gagaagactt taaatggctc aacttgatgc tcgaaactac attcgaaggc ggaaagcata

4271 ttccttacat tggaatctca ccagccaatg actcgactac gaagcataaa gacaagtgga tggaaagagt 

4341 attcgaagtt attcgaaaca gttctaatcc agacgttaag actcacgcat ttgggatgac agttactagc 

4411 caattagagc gtcacccatt ctatagcgcc gactctactt ctgtactgct cacaggagcg atgggaaaca 

4481 ttatgacgcc aaaaggatta gttgacttgt cacagaagaa tggaggaatt gatgctgtcc gtaggctgcc

4551 aaaaccggtt caagtcgaaa ttgaatccat tatcgaagaa actggagcgc attttagcct agagcaatta

4621 gttgaggact ataaacttcg agcattgttc aatgttcaat acacgctgaa ttgggcagag aactatgaat

4691 tcaagggaat taaaaatcgt caacgtcgac tattctagat aagagctttt cgctcttatc ttttttaaaa
4761 aaaaatgaac tttttataca aaaacgcttg

4831 aacgaataag aggtaaataa aatgacagca

4901 actttctaaa agatgttgag tacagtgaca 

4971 tggcgctcat agagatgagc acgatatgaa 

5041 tgaaaaaagt tcaaacttat caagaatatc 

5111 tcgagaagga aaaataggag tcgatgaagc 

5181 gaggaacctc ctttcattgt actcaaaatg 

5251 atatgcttaa aagatttaaa attatttaga 

5321 atatcaaaaa aaggaggctc atattatgag 

5391 cagctcaata agttgaagcc tagcaagttg 

5461 aatgcgtcat gtttacagcg tatgatggct 

5531 tgacgtgatt gtgaaagcag agcagtttgg 

5601 gttcctgaag aatcttcgct aaaagttatt 

5671 aagagtaccc tacattcgac cacttgctcg

5741 gctgttctac ggaatcgcca atatcaacga

5811 ggcttcctgt taaaaggcgg aaaagcaatt

5881 aaaagggact agaaatgctc attccttaca

5951 gtacttctgg caaattgacg atactactgt 

6021 atggaaggta tggaagatta tgaagacgtt 

6091 tccctacagc agaaatcctg agcgtattag

6161 cgccgaattc ttattcttga aagaccgact 

6231 tacgcatctg ctggcaagaa agtttcgaag

6301 aaattgtatc aaccgtcacc gaagaaaact 

6371 atcgaatggt gtcgtttact tcctagcact 

6441 gaattgcaaa gatggttaga gcaggaaaca 

6511 ggttattgaa cgaactcagc ctgaatataa 

6581 attcgaaaaa tgtatttcga aagaatcggt 

6651 tgggcgaagc tggaacattt aggcacgaag

6721 ggactctgaa tggttgaatg tagcagagtt 

6791 cgtttcaaga aaaacgatta tgaaacgaag 

6861 gactagttcg atataaaggc aagctctaca 

6931 acatactgag ccctatgaag aacacaagat

7001 gtcattttcc cttatgaaaa tcgagataac 

7071 tgaaaaatca agtccttgga aaaattatga 

7141 ctattgctct tcagcctatt gcccatattg

7211 tgttcgagga agactttttc gaaggtgcaa 

7281 taccactaat ggatttcgag gagttgcaaa 

7351 tttattgaac tgaaaactac taaagaagct 

7421 agctatcacg cgcagatgga tgcaaattta 

7491 gattatatgg tatccaattt caagccttga

7561 ttcatcgatg cagggtatga agtttcttac 

7631 ttctagatgc agttgagctt cattacaagg 

7701 ttcgagaaga agaaatacga gatgctcaag

7771 cgacgaaatt gttgaagcag cttgcggttc

7841 caaaatcctg tcattatgga agaccttaac 

7911 cagatagggc ggaaatggtg ggaatacaaa 

7981 tctatacatt ttagccgccg ggaaaactat 

8051 gaagaagtca tcgaaaatgc ttacaagcga

8121 aggtactagc atctttaaaa cgaattcaaa 

8191 aaaaggagta ttattaaatg caaaaagacg 

8261 atacacaggt gattgggttg atgtacgaat 

8331 tcaagatgtc gaaaagtgct tcaaaaggct 

8401 cacacggatt tgctcttgaa cttcctaagg

8471 gaaaactggt ctaatcttcg tttctagcgg

8541 ttctcagttt ggtatgctac tcgtgacgca

8611 aggaaaagca acctgctatc aagttcaatt 

8681 aagtacaggt gatttctaat gaaattggaa 

8751 tagcagctca aggacttgaa cgtgaagcgc 

8821 aacctacggc gggctccctc gaaaaagggt 

8891 tcagctctcg acattgtcaa gaatgcgcaa 

8961 tcaaggaaaa gctggaaaat gcgcgtgcat 

9031 actcgatagt cttcaagagc ctcttaagat 

9101 gctaaaaaga ttggagtcga tgttgacaat 

9171 tacttcaata tgttttagac attttcgaaa 

9241 catggtcagt caaaacctta ttgatgaaga 

9311 actgaattta gtcgaaaggt tactcctctt 

9381 ttcgagaaga tatgaatagt cagcacaatg 

9451 tgcagttcga cttaaattta gaaaaggtga

9521 cgaaaccctg cagggaatgt agtagagtca 

9591 tagtttccta tacgctttcc tatcatgatg 

9661 atttggagtc attcaaaagg caggggcatg 

9731 gatgaagacg aagaaccatt gaagttccaa 

9801 acttattcga catggtgatg actgcggttc

9871 ctatttggac ctaagctagt gcctgctagt

9941 aaatcgatga gcaagtggtt gagcttatga 

10011 ttattatttt aatgactcaa ttatagcaga 
actctattca ctcattatcg tataatcata atataaataa
gttcaacaag ttaagttcta cttagaagaa gccggcgctc 
acttagagca agcaattatg aaagatattc ttaaatggaa 
aataacttca tacgaagtat tatagagagg ggtaaggcta 
taaaactagt tgagttcaaa cgtcaacttt ctttaaatct 
ggttattcaa ttattcacct tctatagtct caacaatatc 
caagaggctg ccgtgaacgg gacttatgaa gcaaaactca 
aacggcttta caaactcgcg ataattcgtg tatattatat
tattaagttc aaaaccgaag aactttcaaa aattgtttct 
ctagaaatca caaactattg gcatattttt ggtgacggcg 
caaacttcct tcgatgcatt atcgacagcg atgttgaaat
aaaacttgta gaaaagacca cggccgcaac cgtcacatta 
gggaatggtg agtacaatat tgatattgtt acagaagatg 
aagacgtgag tgaagaaaat gctctcactt tgaaaagctc 
ttctgcggta tctaaatcag gagcagatgg aatttatacc 
actacagaca tcattcgcgt atgtatcaac cctatcaagg
acctaatgag tattttagca agtattcctg atgagaagat 
ctatatttca tcggcttcag tcgaaattta tggaaaattg
tcacagcttg actcaattga gtttgaagat gatgcggcta
accgccttgt actattcact tcagcctttg acaaaggaac
tcgaattaaa acttctacta gcagttatga agacatcatg 
aaagaattca cttgccacct taacagctca ctcttgaagg 
tcactgtctc ttatggaagc gaaaccgcaa ttaagatttc 
tcaagagccg gaagaataat ggccaagtcc aatttaacta
gtgaaggtcc tgcttcatct tttgtcaatt cgctgacccg
tccttcgaca tattataagc ccagcggggt tggtggatgt
gagtctatta tagataacgc agattctaac ctaattgcaa
ttctccaaga gtacatggtt aaaatggctg aaatcgatga 
cttgaaagaa aatccagttg aaggaactat cgtcgacgag 
tgtaagaacg aacttcttca actttcattc ttgtgtgacg 
ttttagagat taagactgaa accatgttca agttcactaa 
gcaagcaact tgctacggaa tgtgtctagg agtcgatgat 
ttcgaaaaga aagcctacac gtttcacatc acagacgaga 
cctgcgaaga gtatgtagag aaaggcgaaa gtcctaaaat
tagaaaggaa ggtcgaaatc tgtgagctat actggaaaaa 
aagactttga gaaagatgct ttcacggtcc gtctatatga
tccctgcgat tatatagccg caactaactt tgggaccttg
tctttgagct ttaataacat cactgataat caatggttcc 
ttctcgccgg aattttagtg tatttccaaa agcatgaaaa 
aaaaattaaa cggtctggag ttaaaagcgt caacccaaac
aagaagcgtc gaactagatt gaccattcct ttccaaaatg 
agaaaagcaa tggcaagacc taagttacct caaattgata 
acgtagcaga ctcgtatggt gcgattatca ataaagtagt
acttgaccag gcaatggaag aaattcaaat agttgtaagc 
tactacattg gctatcttcc cactcttctt tatttcgccg
tggattcaag ttctgctatc aggaaagaaa aatacgataa
tcctgacaag caagcagaaa ctcgaaaact tgtcatgaat 
gcctacaaga aagttcaatt aaagctagaa caggccgata
cctggcaact agcagagtta gaaactcagt caaataattc 
tagacgtgaa aatgattgac cctaaacttg accgattaaa 
tagttctatc actaaaattg acgccgacag cgccgatgtc 
caagtatatt cagtggcggc aggtgaatgc attaaaattg 
gatacgaagc aatcttgcat cctcgttcca gtctttttaa
agtgattgac gaaggttaca aaggtgacac tgatgaatgg 
gatatcttct acgaccaaag aattgcccaa tttagaattc 
tcgtagaatc tttaggaaat gcggctcgtg gaggccatgg
cagttgatga aggactggaa taaggattcg aaagctcttg 
ttccaagaat ccctttttct gcgccttcta tgaattatca 
agttgaattc ttcggtcctg agtcaagtgg gaaaactact 
atggtatttg agcaggaatg ggaacagaag actgaagaac 
ccaaagctag caagactgct gtcaaggaac ttgaaatgca
tgtatatctt gaccttgaga atacattaga cactgagtgg 
atttggatag ttcgccctga aatgaacagc gctgaagaaa 
caggtgaagt tggcctagta gttctagatt ccttgcctta 
gttgactaaa aaggcctatg caggaatctc agcgcctttg
cttactcgct acaatgcaat attcctaggc atcaatcaaa
cctattcaac tccaggcgga aagatgtgga agcatgcttg
ctaccttgac gaaaacggtg catcattgac ccgtactgct 
ttcgtcgaga agaccaaagc atttaagccg gacagaaaat
gaattcaaat tgaaaatgac cttgtagatg tcgctgfccga 
gttcagtatc gtcgaccttg aaactggaga aattatgaca 
ggcaaggcaa atctagttcg acgcttcaag gaggatgact
acgaaattat cactcgagaa gaaggctaat gcaaaaatct 
tcaaggcgca agaaaagaac ggttccaaaa cctaaaccta
accgcagaga gcgtcaagtg cttgttcata gttgcatcta
cgggcagtat gacaaatgga gccacgaact atattctctt
10081 atagtttcgc accctgatga

10151 ctggaatggg tcttccatac 

10221 ttagcttcta aataccgtcc 

10291 tgaaccaatt acaaaatggc 

10361 cactactgct cgaattttcg 

10431 tctaataatg gggtagaaaa 

10501 tcaaagttta catcattgac 

10571 agaagagccc tcatcgggaa 

10641 ctcagtcgag ttcaacggtt 

10711 ttaccgaaag tgaaaatgaa 

10781 actcgcaaat ggaggaatgc 

10851 gacatggaag ccgtttctaa

10921 ttgccaacta tgacggctca 

10991 attagtgact cgaaacttta 

11061 accactcaac ttcctgctca 

11131 tattgtggat gctagaagaa 

11201 aattgaaacc aaacttcttt 

11271 catttcgaaa tggaaacaac 

11341 ttaatccgtt atattgcttc 

11411 gaaacatcat tcaggatgca 

11481 aatgtcagct cttaactcgc 

11551 gttgatagca tcaataatgc 

11621 ctaatgaaga gaaaatgcag 

11691 gattgtagac tattgcaatc 

11761 gagctatttg aaaaggttac 

11831 ttactaattg gctcaaattt 

11901 tttaaattgg tcgacagttg 

11971 gaccttttag tgagggaagc 

12041 gcgtgaacga atttatcagg 

12111 ccaataatct aaagccgttc 

12181 aatgggaaat gtagttcgag 

12251 tctaatcatc gaatattcgc 

12321 ttccggatgt tagatatggg 

12391 ggcctttcct gataattgtg 

12461 aaatactcga ctattgatag 

12531 ttgacaatga attggacaag 

12601 gcacaagacc gaaattgaca 

12671 atgaaagtga ctgaactttt 

12741 ttaataacgc ttgtcttgtg 

12811 aatcaataag attgtctata 

12881 caagctatcg agggcataaa 

12951 ttttttcact tacttaacaa 

13021 tgacagaagt tgcggtaaat 

13091 atatttaaaa aggaagtacg 

13161 tgacagactt taaaaaacgc 

13231 tatggattgg ctcgaaaatg 

13301 gaaggtggac ttgtcgagca 

13371 gcaaaggctg ggaagacatt 

13441 agttggtcag tatcgtgaaa 

13511 tatgaatacg accctgagca 

13581 ttcaactcac gccagttgaa 

13651 aaatttgaat ggatgtggag 

13721 gccgcaactt atgtagtcga 

13791 ttgaagaagt agttgaagaa 

13861 tgaagaggct gaagaaaaac 

13931 gtagaagagc ctaaagaaga 

14001 tcgaagaggt agaaagcgca

14071 tggatatgtt cgagatgtct 

14141 gagcctgacg atgacagcga 

14211 aagaagactt cttctacgaa 

14281 atacgacgaa gaaacttggg

14351 gttgcaaaac ctactcgaaa 

14421 aaatgtgtga aaattgtcaa 

14491 cgacgcctca ttcacttaca 

14561 aaagaccgtg acagcctttt 

14631 agagactttg tattgcaaat 

14701 aaaggctgaa gatttaaagg 

14771 ttgcaattag ttgaatcagg 

14841 cggcactcat ttagaagtag

14911 caggggtata ttgatacccc 

14981 tcaatggtga cgctattgct 

15051 tgaaactatt aaatacgagg 

15121 aactccgcga agactatcaa

15191 agaactcgaa aaccttgaag 

15261 cgaaatgtct tgaggctatg 

15331 accatacatg ggttcaccaa 



gtttcgacag actgttctct ataacgagtt taaacagttt gacggaaata 
gactgtcagt ttgctgtaag ggtcgcagaa aggcttttaa gaaaatgaat 
tcaaactctc gaggaagtgg tagctcaaga atatgtcaaa gaaattcttt 
gctatcaaac acggctatct attctgtggt ggcgctggaa ctggtaaaac 
cgaaggatgt gaacaaagga cttggctctc ctattgaaat cgatgctgct 
tgttcgaaac attattgaag attctagata caagtctatg gacagcgagt 
gaggttcata tgctttcaac cggagcattt aatgcgctgt tgaaaacatt 
ccgtgttcat tctatgtact actgaccctc aaaagattcc tgacactatt 
tgactttact cgaattgata atgacgacat cgttaatcaa cttcaattta 
gaaggagctg gttatagtta tgagcgtgac gccctttcgt ttattgggaa 
gtgacagtat cacaaggctc gaaaaagtcc ttgattatag tcatcacgtt 
tgcactagga gttccggact acgaaacatt cgcttcactt gttgaagcta 
aagtgtttag aaattgtaaa tgacttccac tactcaggaa aagacttgaa 
cagacttcct tttagaggtt tgtaagtatt ggctagttcg agatatttca 
ttttgaaagt aagctagagc aattctgtga ggcttttcaa tatcctactc 
atgaatgaac ttgctggagt tgttaaatgg gagcctaatg ctaaaccgat 
tgatgagcaa ggaggagtga catgattgga cagggacttg ttaaatctac 
ctccaaaata tataatcgtc gaaggtgaag taggttcagg acggaagacc 
gaaaettgac gctgattcta ttgtagtagg aacgagtgta gatgacattc 
cagactattt tcaaggcgag aatctacgtg atagacggaa atagcctgtc 
ttttgaagat agcggaagag ccacctttaa actgtcatat agccatgact 
tttacctacg cttgcaagta gagcaaaagt tctaaccatg ctaccttata 
tttgtcaagc cctacaagaa ggtagatact tcaggaattg acgaccgagc 
ttgccagcaa tcttcaaatg cttgaagaca tattagaata tggcgcagaa 
aacattttat gacttaatat gggaggcaag tgctagcaat tcgctaaagg 
aaggaaactg atgaaggaaa aattgagcct aaacttttcc tcaactgtct 
tcaccaggaa gcactatgta gaaatgtctt tcgaagaact tgaggcccat 
atctaggtgt ttgcgaaagg tatctaaaaa gggctcaaat gcgcgtgtct 
agggtcaaac aagttgagtg atttagtatc atttcaaaaa gacattcgaa 
tatatcttgt acggcgaaga aattggtctt atgaatgttt atctcaatca 
aaacttcggt ttcaacagtc tggaaaaccc tcactcaaaa agggctcgtt 
tgttcgagat gataaggagt ttctgtctaa tgagtcgagg tggaaaaggc 
acacttgttt tgatggttac taaaattgac aagcgaagca agttgctaaa 
ttgagtttga gaaaatgact gacgcgcagt tgaaaaggca ttttgtgtct 
cgacatgatt gacatggtta tccagttctg tctaaacgat tactctagaa 
ctgtcgcgat tgaaaaaggt tgacgcatca gtagttgaat ccattgtcaa 
tcctcagcct agttgatgat gtattggaat ataggccgga gcaggcaatt 
agccaaagga gaaagtccta ttggattgct taccttgctt tatcaaaatt 
ctaggagccg atgagcctaa agaagccaat ctaggcatta agcagttctt 
actttcaata cgagctggac tcagcctttg aaggcatggc tattttaggt 
gaatggtcgc tatacagaaa gttcagtggt ctatatttct ttgtataaaa 
ataagctgaa atctgcgtat attacagtat aagcaaagga ggacagccta
agcccgcaaa aggtgagagt agttatggtc gggaatattg aatttctcga 
gaacagaaac ttccatcagt tatattatag aaaatgaaag gggtctaata 
ttcaagaaag cagtaacaga aacaatcaat cgtgacggta tcgagaacct 
ataccaattt cttctcaagt ccagcaagca ctcgatacca tggaagctat 
ctcattaaac gtgctcaatc aactactttt cgaaatggat accatggtag 
tacccaatgg aaacagttgc aatcgtagca ctatttcacg acctttgcaa 
ctgaaaaatg gcgcaagaac agcgacggtg aatgggaaag ctatttagca 
acttacaatg ggacatggtg caaaatctaa tttccttctt caacgtttca 
gctcaagcaa ttttctggca tatgggagcc tatgatatta gtccttatgc 
cagccttcga aactaatcca cttgcattct taatccatcg cgcagatatg 
aaacgaaaac ttcgaatact ctcaaggtcc agttgaacaa gaggctgagg 
aaacctaaga gttcaactcg taagaaacct gcgcctaagg aagaaaaagt 
caaaagctgg aatcactcga cgtcgcaaac ctgcgccaaa agaggaagag 
gcctaagaaa gcatcttcta aaattcgaat gcctaaaaag actgaaaagg 
gacgagccga aagttgaaga agcagaggac gacaatgtgg tggtacctgc 
actacttcta cagtgaagtc gctgacgttt actacaagaa agatgtcgac 
cattcttgta gacgaagaag agtacatgga cgcaatgtgt cccgtattag 
cttgacggca aggttcacaa attagcaaaa ggtgaacgct tgccggaaga 
aacctatcac tgaagcagaa tacatcaagc gaacagaaaa acctaaagca 
aactccagcg ccttctcgcc gccctcgccc ttaaaagaaa ggttgaaata 
aacgaaacat tcaatactag aattttcaat gaagatgaaa gtggctatgt 
aggagattcg cgacaccgca gcagctatta gcaatcgagc ggtagaaaag 
agtcgctaca gttatggctc ttcccgtttc tcacgcagaa gatttaggca 
tctcgactgg aagcatttcg tgaagctgtt caagaggctc tcgagaatga 
acgttatctt aggtcttatc gacgttgaca aaaaaattgg caaccttgca 
agcatcataa tggaacgaat aaagacgcta tttcacgtga tttatgctaa 
cagctttgtt cgataccgtt gatgattatg atgacgttat agaggacatxr " 
tgacctttat aatcaaagga gcattagaat ggcgcctcac aatcctgaca 
actgacattt tactacgact agatgatatt atctacgtcg acgcaacttg 
agcctattgc atgaacaatc agcgaaagca aacgaacaaa cgaatcgtcg 
cgtgcaagag gtcgaataaa cttccttctt gctgtaaagg accacggcga 
cctttgtggg atacattgac aatctagtcg aatgttttcc tgaaagccaa 
tgtattagat gaccttccag tcactaatgc ggccgctgaa attggatacc 
cttcgagaca aagcagttga aacacttgaa gaaattttag atggggataa 
15401 cattattcgc tctaaacacg gaatcgaaat taaggagaaa cttgatgaat tatatggtaa aagtcattct

15471 agttagtgtc cttgtactgt cagccttttg catgacttgc tcaatggtct atttggttac aggtaagcaa 

15541 gaggaccacc gtagtaccgt cgcccttgta tttggcgctc tcgtaagctc tgcggcgttc tattcgacac 

15611 tctttatcct cgcccatctg ccatgacatc acgcgcatac aaaccaattc ccacgcgcag agctagtgct 

15681 aaacaagaga aggcagttgc taagcagttg ggaggaaaag tacagcctaa ttcaggagcc actgactact 

15751 acaaaggtga cgtcgtaaca gactcaatgc ttatagaatg caagacagtt atgaagccac aaagttcagt 

15821 cagcttgaaa aaggaatggt tcctaaaaaa tgaacaggaa aggttcgctc aaaaactcga ctattctgct 

15891 atcgctttcg actctggtga cggaggcgaa cagtatatag caatgtctat aagtcagttc aagcgaatat 

15961 tagaggatag aaatgataac cttatttaaa ataaacagtg aaggaacagc tactccaatt aaagggtcag 

16031 ccatgcaact gtacgcagac cttattccta tacaagagga cgatatacag ttcgttgata taactggact 

16101 tgaccctatt gttcgagaaa acgcacttga gctcatttca cggagccgtg taggagtttc aaaatatggt 

16171 acaaacctcg accagaatga tgtcgacgat ttcccacagc acgccaaaga agaagcgctc gaccttgcta 

16241 actacctaac caagctacaa agtcaacaaa agcaaaataa atagacctat ttctaggtct atttttatta 

16311 ttgataaatt ccagcaatct gacgagcgca atcttctagc gcagatacta ggtggcggct ttcttgttta 

16381 ccttgttcat ttcttgcttc aattctttcg ttaaggcgtt cgattcttgt agttaatttc ttgatgattt 

16451 caattctagc atcaacttcc acgtcgcgag taagtgtgac tccagtttca gcgacaggac atgctttgaa 

16521 tactgcaatg tcaagttcgc tctttctaat aactgagcct aggtctaagt acaagttagg attgattcca 

16591 gtgaccttat attgtttctc agtttctttt acaggaatgc tttcatagtg gaaagtgtag ttcttgtgac 

16661 cgtctttcca atctgctgta agataaccga aataaagtgt tgtttccata attgacctct ttctgcgtcc 

16731 ttgacgcttg ttttatttat attatgatta tacgataata aaggaataaa gtcaagcact ttttacaaaa 

16801 aagttgaact tttttaaata tttttttttg aaaataaaaa gccctaataa tagagctttt agtttagcag 

16871 aaaattaagt ccatcttcat aagcaagaat ctgtccgtac tggtaagaaa tagctgattc aatatccggc 

16941 atttcgtgga ctcctttttt aagttcgtcg atagtacagt tacaatgacc tattcttgac tgaagttcct 

17011 caatcctttc gagtcgcttt tcattttgtg taccaattgt tttcgagtct aggtgagtga aggaacttgc 

17081 aatagtttga atggcttcaa aaaagtccgt tattgaaact cctttataag aaagctcatt ccgtgtatag 

17151 caggaaagca aagcgttcca gctagtgatt tgaatttgag ggttaggaga gtttcgataa gctacaaaat 

17221 ttagaatatc tttgtagtca atatcagctt cagtatgatt gttgataaat accttcattt tataaccctt 

17291 ccaaatcttc gtcctcgtca tcgttttcat agcaggcgat aacttcaacc cactcgtcgt cctcaccttc 

17361 gtttcgaact cgaatgctaa ggacttccat gtcctcaaca tcttcgaatc cttcattagg tgcatatcct 

17431 tcccactcta aatcgtcgta gtcgaagata gttacaagac gtccgtcaaa ttttactgtt tcctttactg 

17501 ttgccatttt agtttcctcc ttatgcgata tatagtttga taatttgaga ttcgatgtca ccatagttga 

17571 tgaacttaac ttggtcgacc gtttcttcca tgtattcgcc catgtcttcg attcttccgt cttgaatcat 

17641 ttggccgttt tcgttgataa tttcgtacca ccattcatca ccgaattgtt tgattgcttc tttaactgtt 

17711 ttcattttac tacctccact ttttcgtcca ttagtgattc gttatcatag aaccgaatac gtccatcact 

17781 aagacgttct aggcttaccc atttacgacc ttgacggtca gttactttaa attcagtacc ttttgcattt 

17851 acaactttca ttcctacttg caaatcttta acttttacca ttttatatga ctcctttatt tgtttttctt 

17921 tatagtatta ttatacgata atgagtgaat aaagtcaagt gtttttgtaa acttttttaa attttttaat 

17991 tttttttttc aaaaaaataa cgagccgaag ctacgttatt tatttatctg ctcaagggct tgttgaattg 

18061 cctcatagcc tttacgacgt gctacctttc cagctttaga gccgggtgaa aagtcccaaa cagcttcgtc 

18131 tactttaaag tcatccgcct tggcatagtc gagcaggagc tggatagctt tttgccattt ccgccaattc 

18201 ttggaaaact cacctatatt agcacaacgc aaaacaagtg ctctagtatg ctggctagac ataatgaact 

18271 ctaaaaagtt gtccaaggtt ataggaaggt cctttggaaa ctcataaggc tctttgacat cgtatttgaa 

18341 aaggctgaca atttcactgt ccttaaatag ttcaccgtct ttatacataa taccttgaac aatttcagta 

18411 ggctctgctc cgctacctag tacatcgcca accgtgtgac aataggcttt aagaactgca aaaaaacctg 

18481 gggcgtctgc acgcgcaacc tggagctcct taacagtcat ccaaggctga ggtttcttac aaacaatcct 

18551 aattccttca aaatagctct tgtccgggtc aatagtgcct aacattgtca gcctgttttt atttatataa 

18621 aggtcgaaat atacttgaat ttcatctgta ttaggcagcc acttaacagt gacttttcta taagcgattg 

18691 cttttacatt tacttttttc gagagatttg tagggataag cattttcctt ttgacattta ctttttttcg 

18761 ctttttgttc tttgccatgc tagtatctcc atttctgttg gccttgcttt ctagctctgt tcagttcagc 

18831 tgcttctcgc gatgcaatag tttcgagaat atgcctgttc ataggctcac aatattccgc caaagatttg 

18901 ccagttatgg tggcgtcaat taagtaacca tctattgact ccttaccata aaatacaaaa tcgtcttggc 

18971 atactagcct tttataatag ccatttcctg cgcgtgtttc aattttaact aagctcattt tcacccaaac 

19041 ttgtagacga taaggagttc ctggaacttc gaacaggagc ctcctttttt catcgtctac ttgtittaata 

19111 catgagtttt gaaaatggat aactttccat ttattttcca tagtttcacc ttattccatg tacccgtcaa 

19181 caatccataa ttgaaaaggc ttatcttctc tataaggccg tgataatttt agtccagttc ccactacatt 

19251 tgaaagcgcg attaggtcat ctaggctgtc tagctcgagt tcgattacaa ggttgccagt atcaatttca 

19321 caaaagtaag cgacatttcc aactttctct agtgcttcac gatacctatc atatgtcgcc tcttcgtcaa 

19391 atagtcgcgc agaataaact tcgaatttca ttttagttac cgccttccaa aatttcatcg ggcataatct 

19461 ttgcattctc gccatgaaac cgcccttcaa tatacgcttc aagattgaag tcatgttgag gtctgtcaat 

19531 tccttccttc tttaaatttc gaaatgtgtc ctgaagcgca ttttttgttt gctcgctagg taggaccata 

19601 agtgaatatt cttccacctg ctttttaaat cgaatggcta aggctgacaa aaagcctttg aggtatgaat 

19671 tcttgtagga aggttcgcga gtaggaagtc ggtcaatacg gtaacgaaga taaagcaaag cagcctcata 

19741 tattttagac actaattcag cgtcttgttt ttcgccgaag aaaattattc gacttttatt caagcgcata 

19811 tcacgctgat taatacaaaa gcacctaaaa ttagtcgcga gaatatgacc aagttcacgt tcccaccaaa 

19881 atattcgacc tgcttctttc ccaacagctt gagaagtctc gaactgttta ggttcatcaa attgttcaac 

19951 ttgagcaagt gcgatattat tctttagcat caacttttga gccataagaa gggcagtttg cccctcttcg 

20021 tcactcgggt tgtcatttgc taattgaata agatttttaa ttttttcaat aattttttcg ttattcatat 

20091 tagtcacttt ctatcatatt ttcgagcttt cgaaaagtca atgtcgtcta cttcaattgt cttgtcataa 

20161 gtccaagcgc gacaagtgtc gaaatgaaat aggctacaaa acatcttttc attatggtcg aaactttcag

20231 tacatttttc aatatctact tcaagttcga gaacgacaat agtatcaaca tttcgaagcg ataaaaaggc 

20301 tagagccttt tcataacttt ctgctaggta aataactcca gctgaaggct tcaatccttc agctagaatt 

20371 ttaccaagat tatcaaaatc agtggcgtga taaagtttca ttagttactt ccttacatat ctagagtcac 

20441 tacataaata gaagcagttt tatcttccaa gtcctactca atagcttcct cttcgctgag tttttcgagt 

20511 tttaaaactg tcgcttcagc tacaacatta gcaaagttcg aaccgttgag aatgttttcg atatttcctg 

20581 cgcctaagac ttcagcttgg tcattgttca ctaccattag gtattcatta gtaagtgctt tagcaaagtt 

20651 tgaaaatttc attttatttt ccctttattt gtttttcttt atactattat tatacaataa tgattgaata 
20721 aagtaaagca ttttttataa aaaagttgaa ctttttttac aattttttga actatttaaa aattataaaa

20791 tgggtggaaa atttaggcga caatttatac ccattttcaa cctcatttat aaacaaccta atatagaaaa 

20861 ggacttaata agtaaataaa aaagcgccct gaaaatacct acaaatccca tagtccgtaa gtaaaaacaa 

20931 aaattagggg cgacataaaa gtcgagcact atcttaatct attaccagtc tcatatacaa tcgacacaga 

21001 tctagcaggc ttttagcaaa cccccgaaca gcatgaaaaa gcatacaatt agaggaacag attatagaaa 

21071 aagcacttcc acaaacaagt tctcaaaacg ctctcaaaaa ccgtaaaatt agtaagtttg aacttttcga 

21141 acttctaaac tcttcgaata accgagccta atttagaggt cgaaaaactc aatttctcga aaagtcgaac 

21211 ctgctcgaaa acctcaaaac acccgaaaag tcgagcatag aaaggggtcg aaaagtcgag aatgctcgaa 

21281 aaactcaacc ggttcgaaaa cctcaatcct tcgaaaagtc gaaccattcg aaaagttcaa aagttcgaaa 

21351 aactcaacca ttcgagagta ggaattaagg acataccagt tcaacctttt tagcttcaaa atcactcttt 

21421 ttctcattat aggactataa attcagtcaa ttgtaagtca cgcgcaaatt tgttacaatg taaacgataa 

21491 aatataaagg agggtcaata aatggcgaaa gctactggac caaaagttcg aagaggaaaa actcctccac 

21561 ggccaaaaga caaaaaagga atcaaagcaa atgcgcgtgt caataaagac cagttcgtag agtatgacta 

21631 taaaggcatc aagatgacaa ttaaggaacg tgatgctaga atgaaattgg aatttattag aggcatgact 

21701 atccaggaaa ttgcagcccg ctatggatta aatgaaaagc gtgttggcga aatacgggct cgcgataaat 

21771 gggtgaaggc taagaaagag ttcgagaatg aaaaggctct tgttactaat gatacattga ctcaaatgta 

21841 tgcagggttt aaagtctcag tcaatattaa atatcacgcc gcctgggaga aactaatgaa catcgtcgaa 

21911 atgtgtttag ataatcctga cagatattta tttactaaag aaggaaatat tagatggggc gcattagatg 

21981 tcctttcgaa ccttatagat agagctcaaa aaggacaaga aagagcgaat ggaatgcttc cggaagaggt 

22051 tcgatataga ctacaaattg agcgcgagaa aattacattg ctccgggcca aaatgggcga ccaggaaatt 

22121 gaaggcgagg ttaaagataa cttcgtagaa gcactagata aagcagctca agccgtttgg caagaattta 

22191 gtgacgcaac aggttcctac attaaaggag tgactgataa tgacaataag cctgagaaat aaactaccta 

22261 agttcaactt cgtccctttt agtaagaaac aactccagct cctaacatgg tggacaaagg gctcaccttt 

22331 tcgaactttc gatatcgtca tagcagacgg ttccattcgt tcaggaaaaa cagtatcgat ggctctttca 

22401 ttttcccttt gggccatgac ggaattcaac ggacaaaact ttgccatctg tggtaagaca attcactcag 

22471 ctcgacgaaa tgttattcag cctctaaagc aaatgctcac aagtcgcggg tatgaaattc gagatgttcg 

22541 aaatgaaaat ctacttatta ttagacactt tagaaatggc gaagaaattg tcaactactt ctatatattt 

22611 ggaggaaaag atgagtcgag tcaagacctt atacaggggg taacattagc aggtatcttc tgtgatgagg 

22681 tggcactgat gcctgaatcg tttgtcaacc aagcgacagg gcgctgttcc gtaacaggtt cgaaaatgtg 

22751 gttctcttgt aacccggcca atcctaatca ctacttcaag aagaactgga ttgacaaaca ggtcgaaaag 

22821 cgtatcttat atcttcactt tacaatggac gacaacccta gcttgacgga tagcattaaa aggcgctatg 

22891 agaaaatgta tgctggagtc ttcaggaaaa gatttattct cggcctttgg gtaacagcag atggtctagt 

22961 ttattcaatg ttcaatgaag agcagcatgt caaaaagctc aatatagaat tcgaccgttt attcgtagca 

23031 ggcgactttg gtatctataa tgcaacaacc ttcggccttt atggattctc gaaacgtcat aagcgctacc 

23101 atctaattga gtcatactac cactcagggc gcgaggcgga agagcaacta actgaggcgg atgttaattc 

23171 gaatattcaa tttagttcag ttctacaaaa gactactaaa gagtacgcaa atgatttagt cgatatgata 

23241 cgaggaaagc aaatcgaata tataattctc gacccgtctg cttctgctat gattgttgaa cttcaaaagc 

23311 atccttatat agctagaaag aatatcccta tcattcctgc tcgaaatgac gtgacgcttg gcatttcatt 

23381 tcacgctgaa ctcttggctg agaatagatt tacactcgac cctagcaaca cgcacgacat tgatgaatac 

23451 tatgcttaca gctgggacag taaagcgagc caaacgggag aagatagagt cattaaagag catgaccact 

23521 gcatggatag gaacagatat gcctgtctca ctgacgctct aatcaacgat gacttcggtt tcgaaataca 

23591 aatattatcc ggaaaaggcg ctagaaacta actaaacact tttatagaaa ttagtgtata atataagtag 

23661 gaggatttta aacatggcta aaaaatcaaa agctatctca cacacagacg aactgattag tcagtcgttt 

23731 gacagcccct tggcaaagaa tcaaaagttc aagaaagagc ttcaggaagt tgaaaagtat tatcaatact 

23801 tcgacggatt tgatgtcacg gacttgaata ctgactatgg gcaaacatgg aagattgacg aagactcagt 

23871 cgactataaa cctactcgag aaattcgaaa ctatattcga caacttatca aaaagcaatc acgctttatg 

23941 acgggtaaag agccagagct tatctttagt ccagttcaag acaatcaaga tgaacaggct gagaacaagc 

24011 gtattctatt cgactctatt ttaaggaatt gtaaattctg gagcaaaagt acaaatgcat tagtcgacgc 

24081 cacagtaggt aagcgggtat tgatgacagt agtagcaaat gccgctcaac aaattgacgc ccagttttat

24151 tcaatgcctc agttcaccta tacagttgac cctagaaacc cttccagctt gctttctgtt gacattgttt 

24221 atcaggacga gcgtacaaaa ggaatgagca ctgaaaaaca actttggcat cattatagat atgaaatgaa 

24291 agctggaaca agtcaatcag gaattgcaac agctttagaa gacattgaag aacaatgttg gctcacttat 

24361 gccttaacgg atggagagtc gaaccaaatc tatatgacag aaagtggcca aactactatc aaggagacag 

24431 aggctaaact tgtagaaatt gaagacaacc taggaaacaa gattgaagtt cctttaaaag ttcaagaatc

24501 cgccccaacc ggcttgaagc aaattccttg tcgagttatt cttaatgaac cattgactaa tgacatatac 

24571 gggacaagcg atgtcaaaga ccttatcaca gtagcagata acttgaacaa aactattagt gacttacgag 

24641 attcacttcg acttaaaatg ttcgagcagc ctgttatcat tgatggctct tctaagtcaa ttcaaggaat 

24711 gaagattgcg ccaaacgctt tggtcgacct taagagcgac cctacttcct caatcggcgg tactggaggc

24781 aagcaagctc aagtcacttc catttcagga aacttcaact tccttccagc ggctgaatat tatttagagg 

24851 gcgctaagaa agccatgtat gaactaatgg accagccaat gcctgaaaag gtacaggagg cgccatcagg 

24921 aattgcaatg cagttcttat tctacgacct aatttctcga tgtgacggaa aatggattga gtgggatgat 

24991 gctattcaat ggctcattca aatgctggaa gaaattttag caacagtgaa tgttgacttg ggaaatattc

25061 ctcaagatat tcaatcaagt tatcaaacac ttacgacaat gactatcgaa caccactatc caattcctag 

25131 cgatgaactt tctgctaagc aacctgcgct cactgaagtt caaactaatg tacgcagcca ccaatcttac 

25201 attgaagaat tcagtaagaa ggaaaaggcg gacaaggaat gggaacgcat tttggaagaa cttgctcagc 

25271 ctgacgaaat ctcagctgga gcattgcctg tattagcaaa cgaattaaac gaacaagagg agcctcaaga

25341 cgaaacgagt gaagaagacg aagttgatga caaagaaaaa gaacaaactg aacaaccaac cgaagaagga 

25411 gtcgacccag acgttcaagg ttaattgtga ccattgtgag cataagttcg accttacatc taaacagatt 

25481 atttcgaaac atatcgaaaa gggcgtagag tggagattct tcgaatgtcc taagtgccat tatcggt&ca 

25551 ccacttatgt aggaaacaag gaaattgaaa accttattcg atttagaaat acttgtcgag ctaaaatgaa 

25621 gcaggaactt caaaaaggag ctgctgctaa tcaaaacact taccattcat atcgaattca ggatgagcaa 

25691 gctgggcata aaatctcagg gcttatggcg aagctaaaga aggagataaa cattgaaaaa cgagaaaaag 

25761 aatgggtatc tatatagctg ggaaaaggct attcatgaaa ataatattcg tctaaccctt gaacaggaac 

25831 aagctgtact gaaagccttc agcgatgcag gaactgattt aattgcaaag attaaaaagt ctcgaaatgg 

25901 atacttgcct aaaagaatct ataaagacta cgcttacgac ctgcacgctg ttcttgttca actaatgact 

25971 gaatactctc ataaggcggc aatgaacgca gtagatggcc aggtagttca tattctacaa gtattagcag 
26041 aagatggaaa tgctacggct gaaaagttcg

26111 agcagccgag gcagttgtca aaggtgaaat 

26181 tcagccgcac gcgcaggaaa tgatgttcaa 

26251 cagatatggc taaaatgctc gagaaatata 

26321 agctgagaag ctagggaaac ctgctgctca 

26391 cgaactacca ttagccattc cgccacagct 

26461 aagttcaatg gcattctgtt cacgccccag 

26531 atttcctatc gaagaatgtc ctttcgacca 

26601 tcactcgaag aaatcgctga tgagttgaga 

26671 ggtacgacga tctaagttca ggaaaagttg 

26741 tcggttcaat accgagtctt tttgtctata 

26811 atattcagtt atgttataat ataagttgaa 

26881 tgttccaatt aaataaaaac agcagattca 

26951 tcaattagaa gacttgttaa aaggtctaga 

27021 acttcgaaag aactcgatgc taaaattttc 

27091 tcgatgaagt tgttcaacag cgcgatgcag 

27161 gctttctaaa caggtcaaag ataacggtga 

27231 aagcagtctc aacttgcaaa aggcgctgtg 

27301 ctccagcagc agacattctt ggatttatga 

27371 aggtcttgat gaagagttga aagctgttcg 

27441 gcagaacaag aggctcaagc taagtcgcca 

27511 gtggtgttcc cgaacctcgt gaaatcggct

27581 agcacaagaa caatcatcat tctttaaata 

27651 actgatttta atcaaaccac tcgaagcatt 

27721 ttccagctac cgcagcaact caagtaggga 

27791 tactacattt gaaggacgca aaactggact 

27861 ttcgctgacc aagaagtgtt tgaaggtgaa 

27931 aatatgcagc ccttcgaaaa gttggcgatg 

28001 ataggaggaa ttatagatga atatttatga 

28071 cttccttcaa acgctcttca ataccttgga 

28141 tttcatggct caagggtgca aataatttgc 

28211 tcctcgtgaa cgtgctggat ttagcaaaca 

28281 ggtgaaaaag accgtcaaaa cttgcaaatg 

28351 ctcaactcta taatgatact aagaaccttg 

28421 attgctccaa tacggcaaat tcactgtcaa 

28491 atggatgcta agcaacaata tgcagtcact 

28561 acattttagc agcaatggat gacatcgaaa 

28631 aaacacttat aaccaaatga ctaagagtga 

28701 tgggaaaact tcttgcttct tgcaagtgac 

28771 ctgtctactc taagaaaatt gctcagttcg 

28841 gttcaacttg attgacgacg gtaaagtggt 

28911 actactccag aagcattcga cttggcttca 

28981 ctaccgttac aacttatctt gaaaaacatc 

29051 atcattcgaa ggaattgact atgtaggagt 

29121 aagctcttag caccttaatc gtttccggag 

29191 gcttgcttcg tctttaattg aacgcaattt 

29261 gaaactgttc ctcaaacaat tgaatcagtt 

29331 cggctaaaac cgttcctgag ctcgttgaat 

29401 aaaaagcgaa tatatcgacg ctttaattaa 

29471 tgaattagtc aaaatcaata tcgataacga 

29541 cttttagaca agcataaatc tgtcgcctat 

29611 tggtaaccct tggacctatc agtctaaaag 

29681 tgaccaatat aagcaagaac agcttgaaac 

29751 agggctgatg ggacatgagt tatgacgtga 

29821 tcctactaaa atcaaggtac tccgaaactc 

29891 gcgaatgaag tcgtagcaga cgaccttgtt 

29961 attctactga cgcgggaaaa atttttgccc 

30031 aatcattcaa cgagccgata ctatcgaaat 

30101 aatcttctcg agcaagacat tttgatagaa 

30171 agtatggaag cctgaagaat ttgttagtaa 

30241 acagtctgcg aagtcgctgc tactaagatg 

30311 cagggaatgc tcgacagaaa ctcaaaggag 

30381 atcacatcac atggactacg ggttttggct 

30451 gctgtagaag acaatgtcga agaacttttt 

30521 taaacgaacg acaatgatgg acagattgaa 

30591 cttccaggag ttgaatttga cgagcaagat 

30661 atagaatgcc cagcgcaaca aacagcctag 

30731 aaactcaatt attggtatcg acgaatatag 

30801 gtaacctatg cagaaactgg tgactacttc 

30871 gaattccaca aggaggaaac taataatgag 

30941 gagcttgacc cattgactca gttgccaaaa 

31011 cagaactcga agccgtgacc tcggagggaa 

31081 cgtgcgtact ccagaccttt tatacggtta 

31151 atggccctaa ttgaaggtgg tacagtacgt 

31221 ttgcacaagg tgcttctaat atgaaaccat 

31291 aattgtcaac tacgtgaaaa tcactttgaa 
aaaaggaagt cagggctgca tccttagtat tttcacgaag
ctataaggac ggcaaaaacc tctcgaaacg tgtttggtct 
caaatagtca cacaaggcct agcaagtgga atgtctgcta 
tcgaccctaa ggttcgaaaa gattgggacc ttgataagat 
taaatatcaa aatctcgaat acaatgccct tcgacttgct 
ggagtgagac aatggggcaa ggttaatcct tatgctcgaa 
gtcgaacgtg tcaagcgtgt atcgatttag atggtgaagt 
tcctaatgga atgtgctacc aaactgtatg gtacgaaaac 
ggctgggcag acggagaacc taatgatgta ttagacgaat 
agaaatacag cgacctcgac tttgttaaaa gctattaggc 
aattgtctaa tttcgagaac cttcgaaaag tagtaaaatg 
aaggaacctt gtcgccttaa tgactcgaaa ttggtttcac 
gccggagggc ggaaaactca ggaggaaaat aaatggctta 
tgaaccaacc atcaaacagg tgaaggaaat tatttcgaaa 
attgacggcg acggtcaaca ttttgtacct cacgcacgtt 
ctaacggctc aattaattct tataaagaac aagtcgcgac 
tgcgcagacc accatccaaa accttcaaga gcaactcgac 
attacttcag ctcttcatcc gttgattagt gactccattg 
accttgacaa cattacggtc gaaagtgacg gtaaagttaa 
tgagtctcgt aaatacttat tcaaagaagt cgaagttccc 
gccgggactg gaaatttagg aaatccaggt cgtgtcggtg 
cttttggtaa gcaacttgct gctgctcaac aaacggcagg 
ataggaggaa ctaactatgc ctaatgtgcg agttaagaaa 
gtcgcaattc ctgaccacta cgttgctttg gctgctcaaa 
acaagaaata cattcttgcc ggaacttgcg tgaaaaatgc 
cgaagtagta tctaccggtg aacaattcga cggagttatc 
gaaaaagtaa ccgtgacagt attagttcac ggattcgtca 
ctgtgcctga atctaaaaac gcaatgattc ttgtcgttaa 
ttatatcaac gcaggggaga ttgctagcta cattcaagca 
ccaactcttt tccctaatgc tcaacaaaca gggacagaca 
cagtaactat ccagccatct aactacgacg cgaaagcaag 
agccactgag atggcattct tccgtgagtc tatgcgactt 
ctattgaacc aaagttcagc tcttgcccaa ccacttatca 
tagacggtgt tgaagcgcaa gcagaataca tgcgtatgca 
atcaactaac agcgaggctc aatacactta cgactacaac 
aagaaatgga ctaacccagc tgaaagtgac cctatcgctg 
atcgtacagg tgttcgccct actcgaatgg tcttgaaccg 
ctctatcaag aaagctcttg caattggtgt tcaaggttct 
gctgagaaat tcatcgctga aaaaacaggt cttcaaatcg 
ctgacgctga caaacttcct gacgttggta acattcgtca 
attgcttcca cctgacgcag ttggtcacac ttggtacggt 
ggcggaacag acgctcaagt tcaagttctt tcaggcggac 
ctgtcaacat tgcaacagtt gtatcagctg ttatgattcc 
tctcacaact aattaggagg tcgctatatg gctacattga 
cagtagtgca ttcagggtcg gtattttctt gccctgaagc 
tgcgttcgag attaaggcgg ctgaagatgg agaaacggta 
gaagaaattg acgaagttga acaaatgcgc gaagagtatg 
tagcaagagc taatggaatt gacatttctt caatttctcg 
gtacgaacta ggagagtaaa atggcagccc aaacggacat 
taattctccg ccaccaatga ctgaccaaag tatctcagct 
gttagttata tgatctgctt aatgaagacc cggaatgacg 
gtgacgcaga ctactggaaa caaatggcgc aattctatta 
tgatgaaaag tcgaacgctg gttcgacaat cttaatgaaa 
attatgttaa gaaccaagtt cgtagagcca ttgaaaccgc 
ttgggtcagt gatggatatg gaggaaagaa aaaggataaa 
tgtttagttg ataattcaac tgttcctgac cttttagcca 
aaaatggagt gaaaattttc attctatatg atgaaggcaa 
taaaaactca ggaagacggt acagggtagt agaaacccac 
cttaaattgg aggtgaacga ctaatgtctc agcctgaatt 
ctgtgaacgg tatcgaaaca agtttcaagt cgctgtcata 
gaagaatacg caaagacgca tgctattcgg acagaccgta 
aagctgcttg ggtaagcgca gaccaaacca tgatagctgt 
agaactagct catggtcgaa aatacaaaac tctcgaacag 
agagcgctga gaaggttatt agactaggag tgaacatgac 
ggaaattctt cctacatttc agctctcgcc tgctcctatg 
acagataggc cggatgacta cattgttctt cgatatagtc 
gaagttttgc ttattggaaa gttcaaatct acgtccattc 
cagaaaggtc cgaaacatta tcaaggacat gggctacgaa 
gacacaatgc tttctagata ccgactagaa atcgaatata 
taaagacatt ctttacggaa tcaagctcgt gcaaarcgag 
gtcggcggag ctaactttgt cgtagatacg gcagaaacag 
ctgaagatgt gaaacgcaat gacacgcgca ttcttgctat 
tgacttaaca ttcaaggaca acacgtttga ccctgaaatc 
caacaaggcg gaactattgc tggatacgac accccaatgc 
ttagaacgaa catctatgtg ccaaactatg taggtgactc 
taactgtacc ggtaaagctc cagggctttc aatcgggaaa 
31361 gagctctacg ctcctgagtt caacatcaag gcacgtgaag caaccaaagc aggtttgcca gttaagtcaa

31431 tggactatgt ggcacaactt ccagcggttc ttcgtcgcgt gacattcgac ttgaacggtg gaacaggaac 

31501 cgccgacgca gttcgagtcg aagcaggtaa gaagatttct ccaaaaccag ttgaccctac cttaacaggt 

31571 aaggctttca aaggctggaa agttgaagga gaatcaacta tttgggactt cgacaaccac acgatgcctg 

31641 accgagacgt caaactcgta gcacaatttg catagaaatt tagaaagaag ggtctgttat gactaatatt 

31711 atcacagctg agcagtttaa gcaactcgca tttcaaatca tcgcacttcc aggactttca aaaggtagtg 

31781 aacctatcca tgttaaaatt cgagcagcag gtgtcatgaa cctaatcgct aacgggaaaa tccctaatac 

31851 gcttttaggt aaagtgacag aactgtttgg agaaacttcg acagtcacta aagacaatgc tagtctagca 

31921 tcaattactg accaacagaa gaaagaagcg ctcgaccgat tgaacaaaac cgataccggt attcaagaca 

31991 tggccgaact tcttcgagta ttcgcagaag cttcaatggt agagcctact tacgctgaag tcggcgagta 

32061 tatgacagat gagcaactta tgacaatctt cagtgcaatg tacggtgaag tgactcaagc tgaaaccttt 

32131 cgtacagacg aaggaaatgt ctaatgtcat agcagtcgct actgaatttc atattagacc tagcgaggtg 

32201 gtcgggatgc aaactgattt aggcaaatac tgcttcgacg cagcagccgt tgcttatatt agatatttgc 

32271 aggaagacaa gactcctagg tatcctggtg acgaaaagaa aaatccagga ttgcaaatgc ttatggagtg 

32341 actattttca gtcgctcctc tttttgtata tagaaaggaa attacatgga ttttgggtca actgcagcaa 

32411 aaatgacttt ggatatctca aacttcacaa gtcaattaaa tcttgctcaa agtcaagcgc aacggctcgc 

32481 actagagtct ccgaagtcct ttcaaattgg ttctgcttta acaggattag ggaaaggact tacgactgcg 

32551 gtcacccttc ctcttatggg atttgcagcc gcctctatta aagtagggaa tgaattccaa gctcaaatgt 

32621 cccgtgttca agctattgca ggagcgacag cggaagagct tggtagaatg aagactcaag caatcgacct 

32691 tggtgctaaa actgctttta gtgcaaaaga ggcggctcaa ggtatggaaa atctagcttc agccggtttc 

32761 caggtaaatg aaatcatgga cgctatgcca ggggtacttg acctggctgc cgtatctgga ggagatgtgg 

32831 ccgcgagctc cgaggccatg gctagttcac ttcgagcctt tggattagag gcaaaccagg cgggtcacgt 

32901 ggctgacgta tttgctcgag cagcagctga tacgaacgca gaaactagcg acatggcaga ggcgatgaaa 

32971 tacgtcgcac ccgttgctca ctctatgggc ttgagccttg aagaaacggc tgcgtctact gggattatgg 

33041 ccgacgccgg tattaagggc tcgcaagccg gaaccacgct tagaggcgct ctctcgcgta ttgccaaacc 

33111 tacgaaagcg atggtcaaat caatgcagga attaggagtt tcgttctacg acgcgaacgg aaacatgatt 

33181 ccactaagag aacaaatcgc tcaactgaaa acagctactg caggactaac acaagaggaa cgaaatcgtc 

33251 accttgttac cttgtatggc caaaactcgt tgtcaggtat gcttgcacta ttagacgcag gtcctgagaa 

33321 attggacaag atgaccaatg ctctcgtgaa ctcggacgga gccgctaagg aaatggcaga aactatgcag 

33391 gacaaccttg ctagtaaaat cgagcaaatg ggaggagctt tcgagtctgt tgctattatt gttcaacaaa 

33461 tccttgagcc tgcacttgct aaaatcgtgg gagcaatcac aaaagttctc gaagcattcg taaatatgtc 

33531 acctatcggt caaaagatgg ttgtcatatt cgcaggaatg gttgcagccc ttggaccact gcttctaatt 

33601 gcaggaatgg tgatgacaac tattgtcaag ttaagaattg ctattcagtt tttaggtcca gcatttatgg 

33671 gaacgatggg aaccattgca ggagttatag caatattcta tgctctggtc gccgtgttca tgatagccta 

33741 cacaaaatcg gagagattta gaaactttat caacagtctt gcgcctgcta ttaaagctgg gtttggagga 

33811 gcgttggaat ggctacttcc acgactgaaa gagttaggag aatggttaca gaaggcaggc gagaaggcga 

33881 aagagttcgg tcagtctgta gggtctaaag tgtcaaaact gctcgaacag tttggaataa gtatcggtca 

33951 ggcaggaggc tcgattggtc agttcattgg aaatgttctc gaaaggctag gaggcgcatt tggaaaagta 

34021 ggaggagtca tttcaattgc tgtttcactt gtaacaaaat tcggtctcgc attcctaggg attacaggac 

34091 cactcgggat tgctattagt ctgttagttt catttttgac agcttgggct agaacaggtg agttcaacgc

34161 agacggaatt actcaagtat tcgaaaactt gacaaacaca attcagtcga cggctgattt catctctcaa 

34231 taccttccag tctttgtcga aaaaggaact caaattttag ttaagattat tgaaggaatt gcatctgctg 

34301 ttcctcaagt agttgaagtg atttcacaag tcattgaaaa tattgtgatg acaatttcga cagttatgcc 

34371 tcaattagtc gaagcaggaa ttaagatact cgaagcgctt ataaatggtc ttgttcaatc tcttcctact

34441 atcattcaag cagctgttca aattatcact gctttattca atggtcttgt tcaggcactt cctacgctta 

34511 ttcaagcagg tcttcaaatt ttgtcagctc tcataaacgg actagttcaa gcgcttccgg caattattca 

34581 agcagctgtt caaattatca tgtcgcttgt tcaagcacta attgaaaact tgcctatgat aatcgaagca 

34651 gcgatgcaga ttataatggg tctagtcaac gcactgattg aaaatatagg acctatctta gaagcaggga 

34721 ttcaaattct aatggcttta atcgagggac ttattcaagt gcttcctgaa ctaattacag cagcgattca

34791 aatcattact tcactattag aagcaatctt gtcgaacctt cctcaacttc tagaagccgg agttaaattg 

34861 cttttatcac ttcttcaagg gttgctaaat atgcttcctc aactaatcgc aggggctttg caaatcatga 

34931 tggcacttct taaagcagtt atcgacttcg tccctaaact tcttcaagca ggtgttcaac ttcttaaggc

35001 attgattcaa ggtattgctt cacttctcgg ctcactttta tcgacagctg gaaacatgct ttcatcatta 

35071 gttagcaaga ttgctagctt tgtgggacag atggtcccag gaggtgcgaa cctgattcga aacttcatta 

35141 gtggtattgg gtcaatgatt ggttcagctg tctctaaaat tggcagcatg ggaacttcaa ttgtttctaa 

35211 ggttactgga ttcgctggac aaatggtaag cgcaggggtc aaccttgttc gaggatttat caatggtatc 

35281 agttccatgg taagttctgc ggtaagtgcg gcggctaata tggctagcag tgcattaaat gccgttaagg 

35351 gattcttagg tattcactct ccttcacgtg tcatggagca gatgggtatc tatacgggtc aagggttcgt 

35421 aaatggtatc ggtaacatga ttcgaactac acgtgacaag gctaaagaaa tggctgaaac tgttactgaa 

35491 gctctcagcg acgtgaagat ggatattcaa gaaaatggag ttatagaaaa ggttaaatca gtttacgaaa 

35561 agatggctga ccaacttcct gaaactcttc cagctcctga tttcgaagat gttcgtaaag cagccggttc 

35631 gcctcgagtg gacttgttca atacaggaag tgacaaccct aaccaacctc agtcacaatc taaaaacaat 

35701 caaggcgagc aaaccgttgt caacattgga acaatcgtag ttcgaaacaa tgacgacgtt gacaaactgt 

35771 cgagaggatt gtataataga agtaaagaaa ctctatcagg gtttggtaac attgtaacac cgtaaaggag 

35841 aaatagatgg ctagcagaca gacgctattg gtcgacggaa ttgaccttgt cgacaaaggt gcaaccgtgc 

35911 cagaatatgt aggactcact ttcgcaggat ttaaggactc aggatttaaa aaccctgaag gcatagacgg 

35981 agtattagat tctccgtcta atgctatgtc cgctcttact ggaagcgtga ccttaatgtt ccacggagaa 

36051 accgaaaagc aagttaatca aaaatacagg cagttcaaac aatttattcg ctcgaagcca ttttggagaa

36121 tttcgacact tgaagaccct ggatactatc gaacgggaaa atttttagga gaaaccgagc a"aggaaa&pt" 

36191 tgtagacgcc caagccttta aagatacttc ccttgtagtt aaattaggga ttcagttcaa agacgcttac 

36261 gagtacagcg actcaactgt tcgaaaggtt tataagtttc aacccgcttc gggaggcgat agcttaccta 

36331 acccaggaag acctactcga caatttagag tagaaataag aactacttct caaatcaaag gatattttcg 

36401 aattggcgaa aaaagttcag gacagtttgt tgagttcggt actaattcag tattgatgga aagtggctcg 

36471 attattattc taaatcttgg aacttttgaa cttattaaaa ttagcagtgc aaatcaagcg actaacttat

36541 ttagatacat taaacgaggc gcattcttca agattcctaa tggaaattca acaattacca ttgaataccg 

36611 agccgatgac gcagcagctt ggacctctac tcttcccgct caagttgaac tgtttctaaa tccgtcttac 
36681 taccagaaag ggaatatatg attgacaata atttacctat gagtccaatt cctggcgaaa ttgttcaagt

36751 atacgaccaa aacttcaatc taaccggagc aagcgatgaa acctttagca agcattacga agacgaaatt 

36821 gtgacccgag ctcgaggaaa agaaactttc acttttgaaa gtattgaaac cccatctatc tatcaacact 

36891 taaaggttga aaacattatc cagtacggag gaagatggtt tcgaactaaa tatgctcagg acgtagaaga 

36961 tgtcaaaggg cttaccaagt ttacctgcca cgcattatgg tatgaactag cagaaggctt gcctaggaag 

37031 ttgaaacacg ttgcttcttc tgtaggcgct gtcgcgctag atattatcaa agacgcaggt gaatgggttc 

37101 gactagtttg tcctcctgac ggtgctaaca aacaagttcg aagcataaca gccgcagaaa attcaatgct 

37171 ttggcatctt cgatatcttg caaagcaata caatttagaa ttgacatttg gttatgaaga aattatcaag 

37241 caagaggtta gaattgttca aaccgctgta tctcttcagc cttatgtcga gtctaaagta gactttcctc 

37311 ttgtagttga agagaatttg aaatatgtca ctaggcagga agattctcga aacctgtgta cggcttacaa 

37381 gttgacaggt aaaaaggaag aaggcagtca agagccttta acgtttgctt ctatcaacaa tggaagtgaa 

37451 tatctcattg atgtttcgtg gtttactaca cgccacatga agcctcgata tattgctaaa tctaaaagcg 

37521 acgaacattt tagaattaaa gaaaatttga tgagtgctgc gcgtgcttat cttgacatct acagtcgccc 

37591 actaactgga tatgaggctt cagcggtcct ttataacaag gttcctgact tgcatcatac tcaactaatt 

37661 gtcgacgacc attatgatgt tatcgagtgg cgaaagatat ctgctcgaaa aattgactac gacgaccttt 

37731 caaactctac tatcattttc caagaccctc gaaaagactt gatggacttg ctaaatgagg acggcgaagg 

37801 agtcctttca ggggaaactg caaatgagtc ccaagttgtt attagatacg cagatgacat tttagggact 

37871 aattttaatg cagaatctgg gaaatacatt ggtgtcctta atactaataa gaaaccgagc gaattagttc 

37941 ctgacgactt tacacggatt cgactagaag gtcctaaagg tgacgcaggt ttaccgggag ctcctgggcg 

38011 tgatggagtc gacggtgtac ctggaaagag cggagtaggg atagcagata cagctatcac ttatgctgta 

38081 tccgtttccg gaacgcaaga gcctgaaaat ggatggagcg aacaagttcc tgaactcaca aaaggtcgat 

38151 tcttgtggac taaaacattt tggagatata ctgacggctc acatgaaacc ggatactccg ttgcctatat 

38221 agggcaagac ggaaattccg gaaaagacgg aatcgcaggt aaggacggag taggtatagc cgcaactgaa 

38291 gtcatgtatg caagttcgcc atctgctact gaagctccag ctggtggatg gtctacgcaa gttcctaccg 

38361 tcccaggtgg tcagtattta tggactcgaa caagatggcg ctacactgac caaactgatg aaattggata 

38431 ttcagtttca agaatgggcg agcagggccc taaaggtgac gcaggtcgtg acggtattgc aggaaagaac 

38501 ggaacagggt tgaagtcaac ttcagtttct tatggaatca gtcccactga ttctgcgatt cctggagtat 

38571 gggcttcaca agttccttct ttaatcaaag gtcaatatct ttggactcga actatttgga cctataccga 

38641 ttcaactacc gaaacgggct atcaaaaaac ctacactcca aaagacggga atgacggtaa aaatggaatt 

38711 gctggtaagg atggggtagg aattaagtcc acgaccatta cctacgcagg ctcaacctca ggaacagttg 

38781 cgcctacttc aaattggact tctgctattc caaatgttca accgggattc ttcttgtgga cgaaaactgt 

38851 ttggaactat actgatgaca ctagcgaaac aggttactca gtttccaaga taggtgaaac aggtcctaga 

38921 ggagttcaag gtcttcaagg tcctcaaggg cttcaaggaa ttcctggacc tgcaggagct gacggacgtt 

38991 cgcaatatac tcacctcgct ttctctaata gtccaaacgg tgagggattt agtcatactg acagcggacg 

39061 agcatacgtc ggtcagtatc aagatttcaa tcccgtccat tcaaaagacc ctgcagccta tacatggacg 

39131 aaatggaagg ggaatgacgg agctcaaggg atacccggga agccaggcgc agacggtaag actaattatt 

39201 tccatatagc ttacgcttca agtgcagacg gatcacgtga gttcagtttg gaagataata atcaacaata 

39271 tatgggttat tactccgatt atgagcaagc agatagcagg gatcgaacta agtatcgatg gtttgaccgc 

39341 ctcgccaatg ttcaagtggg aggtcgaaac gagttcctta attctttatt tgaatttggt ttaaaacctc 

39411 gctattctag ttacaatcta atggacggac aagatcaaac gcaaggacag atatctgcta ctattgacga 

39481 acgtcaacgg ttcaaaggtg ctaactcttt acgacttgac tcaacatgga acggtaaacc gcagaaccaa 

39551 aaactgacct tttctttagg aggagatacg cgattaggta ctccaaccga gtggtctaat ttagaaggtc 

39621 gtatcagttt ctgggctaag gcctctagga acggagtgag cttagctgca cggccgggtt atcgtagtaa 

39691 cgtatttacc gcaaccttaa ccgatcaatg gaagttctac gattttaaat tctttgacaa agttaattca 

39761 aattgtaccg ctgaagcaat tttccatgta ttcactcaaa gttgttcagt gtggctcaat catattaaaa 

39831 tcgaactcgg taacatctct actcctccta gtgaagcaga ggaagacctt aaatatcgaa ttgactcaaa

39901 agccgatcaa aagctaacta accaacagct gacggcactc acggaaaagg ctcaactaca tgacgcagaa 

39971 ctgaaagcta aggctacaat ggagcagtta agtaacttag aaaaggctta tgaaggtaga atgaaagcta 

40041 atgaagaagc tatcaaaaaa tcggaagccg acccaatctt agcggcaagt cgaattgaag ctactatcca 

40111 agaacttggc gggctacggg aaccgaagaa gttcgtcgac agttacatga gctcttctaa tgaaggtcta 

40181 attatcggta agaacgacgg tagctctacc attaaggtat caagtgaccg aatttctatg ttctccgcag 

40251 ggaatgaagt tatgtacctt acgcaagggt tcattcacac cgataacggg atctttaccc aatccattca 

40321 agtcggccga tttagaacgg aacaatactc gtttaatcca gacatgaacg tgattcggta tgtaggataa 

40391 ggagaataac atgacaaaat ttaccaactc atacggccct cttcacttga acccttacgt cgaacaagtt 

40461 agtcaggacg taacgaacaa ctcctcgcga gttagttggc gagctactgt cgaccgcgat ggagcttatc 

40531 gaacgtggac ttatggaaat attagtaacc tttccgtatg gttaaatggt tcaagtgttc atagcagtca 

40601 cccagactac gacacgtccg gcgaagaggt aacgctcgca agtggagaag tgactgttcc tcacaatagt 

40671 gacgggacaa agacaatgtc cgtttgggct tcgtttgacc ctaataacgg cgttcacgga aatatcacta 

40741 tctctactaa ttacacttta gacagcattc caaggtctac acagatttct agttttgagg gaaatcgaaa 

40811 tctaggacct ttacatacgg ttatctttaa ccgaaaagtg aactctttta cgcatcaagt ttggtaccga 

40881 gttttcggta gcgactggat agatttaggt aagaaccata ctactagcgt atcctttacg ccgtcactgg

40951 acttagcaag gtacttacct aaatcaagtt ccggaacaac ggacatctgt attcgaaccc ataacggaac 

41021 cacgcaaatt ggtagtgacg tctattcaaa cggatggagg ttcaacatcc ccgattcagt acgtcctact 

41091 ttttcgggca tttctttagt agacacgact tcagcggttc gacagatttt aacagggaac aacttcctcc 

41161 aaatcatgtc gaacattcaa gtcaactcca acaatgcttc cggcgcttac ggatccacta tccaagcatt 

41231 tcacgctgag ctcgtaggta aaaaccaagc tatcaacgaa aacggcggca aattgggtat gatgaacttt 

41301 aatggctccg ctaccgtaag agcatgggtt acagacacgc gaggaaaaca atcgaacgtc caagacgtat 

41371 ctatcaatgt tatagaatac tatggaccgt ctatcaattt ctccgttcaa cgtactcgtc aaaatcctgc 

41441 aattatccaa gctcttcgaa atgctaaggt cgcacctata acggtaggag gtcaacagaa aaacatcatg

41511 caaatcacct tctccgtggc gccgttgaac actactaatt tcacagaaga tagaggttcg gcgtcaggga

41581 cgttcactac tatttcccta atgactaact cgtccgcgaa cttagctggt aactacgggc cggacaagtc 

41651 ttacatagtt aaggctaaaa tccaagacag gttcacttcg actgaattta gtgctacggt agctaccgaa 

41721 tcagtagttc ttaactatga caaggacggt cgacttggag ttggtaaggt tgtagaacaa gggaaggcag 

41791 ggtcaattga tgcagcaggt gatatatatg ctggaggtcg acaagttcaa cagtttcagc tcactgataa 

41861 taatggagca ttgaacaggg gtcaatataa cgatgtttgg aataagcgtg aaacagagtt tacatggcga 

41931 agtaacaaat acgaggacaa ccctacggga actcgaggcg aatggggact atttcaaaat ttctggttag 
42001 atagctggaa aatggttcaa tccttcatta

42071 aaacagctgg agacctaaca agcggaaaga

42141 aaacttgttc ttcaaagtgg gtggaaccat 

42211 acggcatagt atatttgaga ggaaatgtgc 

42281 tcctgaagga tttagaccga aagtttcaac 

42351 ctatgtatat acactgacgg aagacttgtg 

42421 atgtctcatt ccgtatttaa tttgagctga 

42491 tgttgaacct tacaaaatcg cgccaaattg 

42561 tgtcaaaaca acgattgtga acattgatgc 

42631 gacttgtatg ctgcgaaccg tcgagaactt 

42701 tcgaagatga aattctagct gaacagtcaa 

42771 tctatgccaa tgtggctaaa cgacacagca 

42841 ctgtcctact aaataagtta ctcgaatgga 

42911 aactcttagc actcttaaac agcaggtcga 

42981 gacgtcattc aagacggaac tagaaaaatt 

43051 taacaggcta tacaactctc gaccatttta 

43121 cggaaatggt gaagttgaag ccttgtatga 

43191 gaaactatct aacgaacaat atgacgtagc 

43261 ctaatcacag gtcttggagc gttgtatcaa 

43331 caacttttgc aggtactgtt ctaggagttt 

43401 tgaggtggaa taatgggagt cgatattgaa 

43471 cttatagcac ggaccttcga gacggtcctg 

43541 ctcagccgga gcttcaagtg ctggatgggc 

43611 ggttatgaac taattagtga aaatgccccg 

43681 aaggtgctag cgcaggcgct ggaggtcata 

43751 ctacgcctac gacggaattt ccgtcaacga 

43821 tacgtctatc gcttgactaa cgcaaatgct 

43891 ctggtttctg gtacgctcga gcaaacggaa 

43961 gtcttggttc tacttcgacg accaaggcta

44031 tggtattggt tcgaccgtga cggatacatg 

44101 tcaatcgcga tggttcaatg gtaaccggtt 

44171 caacggcgac atgaaatcga atgcgtttat 

44241 cgtctggcag ataaacctca attcaccgta 

44311 agaggaggaa gctcttttct taatattgtt 

44381 gtcgtatatt actctattta cttattcgaa 

44451 gttgatatga ccctttccgc cctacataat 

44521 gcttgacaac attcactcat tatcgtataa 

44591 cattatgtca aaaattaaat tcgaaaacct 

44661 aagtttaaaa tcgtttcaat tttagcagac 

44731 aacttcacct ttcagcttca actctcgaac 

44801 agaagctgct aaacctgcta aaaaggctgc 

44871 cccaaaccta aaaaagaagt ccttgaggaa 

44941 cagttagtga gaaatctact gttcgaaaac 

45011 tcttgaaagt cgaattgttg aagcctttcc 

45081 cgctctaaga agaacttcgt tactatcgaa 

45151 ggttgacaga agaccaaaag aaacttcttg 

45221 aatttttaaa ctcgtcaagg aagaagatat 

45291 tcgctatgat tgaaatcgtt atagcacgrt 

45361 ggcaagcact gatgaagacg cagttaaaat 

45431 tcttctaata acttcgaact accttataag 

45501 ttcacatctt cggcgaactt gataaagacg 

45571 aagcaatgag cagttttcgt tcaagactac 

45641 gagcatccat gtttcctttt aggcgatgag 

45711 ttagcaggaa ggcaagtttc aaacattgtc

45781 aaaagaagta ggtattcatt caaatgagtc 

45851 ttagtgattg acggagtttc taaacgggca 

45921 ctaacattga aactcttcgc gatgctgtgt 

45991 tggaatggtt attattgacg agattcacaa 

46061 aagctccaaa gttattacaa gatgggactt 

46131 atgttatgaa gtggctaggg gcggaacatc 

46201 ccagttcaac caaatcactg gatatcgaaa 

46271 agaagaacga aggaagaagt tttagacctg 

46341 cgaaacagtc aaaaatctat aaggaagttt 

46411 gcctaaccct ctagccgaaa cgattcgact 

46481 gatgtcaagt cttgcaagtt cgaaagacgt 

46551 gcgtgatact tagcaattgg gaaaaggtta 

46621 caacctggta acaggagaaa ccgcagataa 

46691 tctgttattt taggaactat aggtgcgcta 

46761 tcttagatag tccgtggaca cgcgcagaaa 

46831 aagttctgtc actatctaca cgcttgtcgc 

46901 cggaaaggag aattagcaga ttatatcgta 

46971 atatcctgct taaatagaat gaaaactatc 

47041 acggaagaaa aactgcactc gaactagctc 

47111 tcaaattcct gaaaggacgg caaccagaat 

47181 ataatagaaa ggtatataaa tgaaattcac 

47251 acgttgaacc gctctctatc tacgattaat 
caatgtcagg aagaatgttc atcaggacag cgaacgatgg
ggttctattt aagcaagact tcgaacagaa taattggcag 
cactcaacct atggcgacgc attctattcg aaaactcttg 
ataaaggact tatcgacaaa gaggccacta ttgcagtact 
gtatcttcag gctctcaata actcatatgg aaatgccatt 
gtgaaatcga atgtagataa ttcttggtta aatttagaca 
aatcatgtta taatattttt tagaaaggag gtgagaacta 
tggcagagtt cactattgga caaggagctg aaaagaaact 
aaacgcagta tcaaccgtct ctgaaactct tcatgaccca 
cgagctgacg agcaaaaact tcgcgaaact cgttacgcaa 
agactgaaac agctctaaca gctgaataag gaggcgtcaa 
gtcttgacga cgattattac agcgcgcagc ggagtgctta 
aatcgaataa agccaagagc gttttagagg atatctctac 
cgggattgac caaacgacag tagcaatcaa tcaccaaaat 
caacgttacc gtctttatca cgacttaaaa agggaagtga 
gagagctctc tattttattc gaaagttata agaaccttgg 
aaaatacaag aaactaccaa ttagggagga agatttagat 
aaagaacgtg gtaaccgtag tcgttccagc agcgattgca 
tctgacacta ctgctatcac aggaaccatt gcacttcttg 
ctagccgaaa ctaccaaaag gaacaagaag ctcaaaacaa 
aaaggcgttg cgtggatgca ggcccgaaag ggtcgagtat 
acagctatga ccgctcaagt tctatgtact atgctctccg 
agtcaatact gagtacatgc acgcatggct tattgaaaac 
tgggatgcta aacgaggcga catcttcatc tggggacgca 
cagggatgtt cattgacagt gataacatca ttcactgcaa 
ccacgatgag cgttggtact atgcaggtca accttactac 
caaccggctg agaagaaact tggctggcag aaagatgcta 
cttatccaaa agatgagttc gagtatatcg aagaaaacaa 
catgctcgct gagaaatggt tgaaacatac tgatggaaat 
gctacgtcat ggaaacggat tggcgagtca tggtactact 
ggattaagta ttacgataat tggtatcatt gtgatgctac 
ccgttataac gacggctggt atctactatt accggacgga 
gagccggacg ggctcattac tgctaaagtt taaaatatag 
tctcttaatc ccgcaaggtt tcgaccctgc ggggttttgt 
gatttcaatt ataattaaat agtcaacatg attcatgatt 
ttgtggggcg tttatttttt ataaaaattt tttacaaaat 
tacaattata aaaataaata aagccgaaag gcgaggagga 
taaaaaaggc gatgttgtgc tacgagctaa atctcaaacg 
gaaaagaaag cagaccttga atcattagaa gacggaggtg 
gttggtacac aatggaagat gaaactgaac ctaaaaaaga 
tcctgcagtt gctcgacctg ctcgaaaagg tagagtcgtt 
gaaattcctg aagttaagga acagccggaa gaagttggtt 
ctgctcccaa aaaagaaagc gtgatggcga ttactaaggc 
tgcgtctact cgaatcgtca ctcagtctta catcgcctat 
gaaactcgaa aaggtgtttc tattggagtt cgcgcaaaag 
catctattgc tcctgcatct tacgaatggg cgattgacgg 
tgacaccgca atggaattga ttgaagcttc tcacctttct 
cgaaagctag gcgaggtcga accctattta ttgaaacatg 
ggcagaaaag atttccagct tgcccaacgr agtcgagacg 
tatttcaata atgttataga cgctctagat gaatgggagc 
ttcaagacta cattgactct cgaaaccgaa tagcttcttc 
tccattcgcg caccaggttg aatgtttcga atacgcacaa 
caaggtttag ggaaaactaa acaggcaatt gatattgcag 
taatcgtatg ttgcatatca gggctcaaat ggaattgggc 
agctcatatt ttaggaagtc gagtcactaa agatgggaaa 
gaagacttgc ttggtggcca cgacgaattc ttccttatca 
tcattaaata cttaaatgaa ctgacaaaaa gcggagaaat 
gtgtaagaac ccttcaagta agcaaggggc ttcaattcaa 
acaggaactc ctctaatgaa taacccaatc gatgtattca 
atacactgac tcagttcaaa gagcgatact gtatcgtcga 
tctagctgaa cttcgcgagc ttgtcaacga ctacatgctt 
cctgaaaaga ttcgagtcac agagtatgtc gacatgaact 
tgactaaact tgtccaagaa atagacaaag tcaagctcat 
tcgacaagcg actggaaatc cttcgatttt aactactcaa 
atcgaaattg tcgaggaatg tatccagcaa ggaaagtcct 
ttgaacccct tgctaagata ctttcgaaga cagtcaaatg 
gttcaacgaa attgaagaat ttatgaatca cagaaaggct 
ggaacaggat ttactttgac gaaagcggat acggttattt 
aggaccaagc cgaagatagg tgtcatagaa tcggcgcaaa
caaaggtact gttgacgaac gcatagaaga ccttattgaa 
gatggtaagc ctatgaaatc taaaatcggt aaccttttcg 
tccatattaa ggaaagacac taaaaggaag ccggacagga 
aagagattga tatgtcacct agtgagttag cagagctcct 
cctaaaactc gacaaactgc tcaacaaaga gcaatgctca 
tgaaggaaaa aattggtata aagttggaga gatatgtcaa 
gcttggtatg aagcaaaaga cctcgctgaa gaaaataaca
47321 ttcacttccc gcctgttctt cctgaaccta

47391 cgaaggcgtg aacaaaccca aacgacttag 

47461 actcttgtag ggaaaactga aagggaagca 

47531 tggagaatta aatgaaattt gaagatgaaa

47601 tgctaccaaa ggcgacatgg agaaacaagt 

47671 aatgacattg aatctgctca aggtaagcac 

47741 acgaagaacg cttgaaagaa attatcgaaa 

47811 actttcaggg cttatcgaat acaagcctgt 

47881 gagattgacc aagaagcaat tcttccagca

47951 ctaaaattta gcgatatttt tggttctgcg 

48021 caggcaaccg ctgtctgcgt taattttaga

48091 caaagaatag gcaattcagg aaagcctaaa 

48161 gttctacctt attcaagaag gacgtggcaa 

48231 tgaagcactt aacggaaaac aattcgaacc

48301 gaatttattt tcaatattaa gtgcatcgat 

48371 gaacttattt aaacattgag tcgaacattg 

48441 aaatgttcga aaaagattga acctaagcga 

48511 ttggacgaac tcgaaggaaa aacgggttca 

48581 attttttaaa atgtggttta caaaatgacc 

48651 cggtatatat acaccaataa tcgagaaata 

48721 gaaaatttag ctgatagaat atggaagaaa 

48791 agtatttcga acctcaagtg ttagtcgaac

48861 tcgagcaaat atagtcgaag aagttcgaaa 

48931 gggaaaacta gctgggcggt tcgacttttg 

49001 ttgagaaagg aatgtttgta gtgtcagctc

49071 catgcaagaa tttctcgaac gtttcgagcg

49141 ggaggttcct taaccaaggc ctcttatcct

49211 tgtcgactat ttatacgact aattatactg 

49281 tcgtatatac gatacttcag tggttctaga

49351 attgaatcat agatatagta acatcacaac 

49421 gcggtgtccc attgtgcagg agtgcataat 

49491 agaaagaaaa gtcagccgtc tacttgacag 

49561 caaggaaagt cctctctaca atgaaaaggg 

49631 agcttcaagt cttaaataaa gttctcgaag 

49701 agaatacttc acggattatt tagacgagta 

49771 ccggacgacg aaactattct cgaccatttt 

49841 accttatcga caagctaaaa gaggagcatc 

49911 ggacattcaa gtagatagta acattgcgat 

49981 tctaaattcg taggcggact agacattgct 

50051 gaaaccacga cggtgaaaga cttggaatat

50121 acttcctggt gaggatttga ttgtcataat 

50191 atgcttgcaa ctgcttggaa gaacgggcat 

50261 ttggtgctcg tatagatact attctttcga 

50331 ccatcagttc gaaaaatatg aggaccatat 

50401 acgcccttta tgattggagg aaagaacctt 

50471 catctgtggt ggggattgac cagctttcac 

50541 ccagtacgcc aacatcacca tggacctata

50611 gtccaagcag ggcgttcggc taaaactgaa 

50681 atggagtagg tcaaaatgct agcagagtta

50751 atctgtcgtt aaaaaccgat atggcgaaga 

50821 acctatactc ttataggatt caaagaggaa 

50891 aagcaaaagc ctctaggtcg actgctcgtc 

50961 atgaaagtaa atggtcttca aattgaagcg 

51031 aagacgaagg aacattcatt tttagacgaa

51101 tcatgcagga gggactgaaa agcatccctc

51171 gtgacggaag ctggaacggt tcactgtttc 

51241 atgtattagg tcgaaacgat ggagggttct 

51311 cgaagtagtt aggcaaggcg tcagccctga 

51381 aaaatcattc ctgaagagga acttgataaa

51451 cggacgagct catcgagatg tttgatgtag

51521 gaacctcaag ggcgaaacag tattcttcaa 

51591 gatgacccta aaacggaatt tctttatggc

51661 ctattagtca agtattcgtg actgagtctg 

51731 agtcgctctt atgggagtag gtggaggaaa

51801 gttctagcac ttgaccctga taacgctggg 

51871 gcaaggtcgt tagatttttg aactacccta

51941 ggaattatta aattttaatg atttagtctt 

52011 tttaaaaaga ggtcatatca atatgaaaga 

52081 tggactgacg aagaatgtat caggaacttt 

52151 gttattttgg gatgctttat tcctatgcaa

52221 tgcattcgag actatttcaa aatgtttggc 

52291 cttacaagac tcttcaagaa tagaatagtc 

52361 attggtatgt agaagtgacg ttcgatagcg 

52431 gacagttggc tattgtgaag actacggaaa

52501 aatacagagt atgcttatat ctcgtctgtc

52571 gtgaaattgg agtaagcagg tctgctatta
gaacagacct tgaccatcgt ggttctcgat tctgggatga
ggacaaccta atgcgcggtg acttggcatt ctacactcga 
attcaagaag atgctaaagc atttaaacgt gaacatggat
aacagttcat cgctgcaatt gaagaagccg gtgaattaaa
caaaagtctt cgtgatgctc taaaagagta catgaaagaa 
ttttctgcta ccttctacac gacagagcgc tcaactatgg
aattagttga cgaagccgag acggaagaaa tgtgtgaaaa
catcaatacg aaacttctcg aggatatgat ttatcacggc
gttgtcattt ctgttacaga aggcattcgt tttggaaagg 
acgtttttag ggttagcaga atccaatcac accacttgcg
aggttaatat tataccataa ggaggagata agtggcaagg 
aatgaaattg aactaacatt caaagacaag cctaaaactc 
caggtctttc aaaagtcgag catgattatt ttcaaatagt 
taatatgaag caggtgtcat ctttctttat agttcagtat 
tataactggt tcaacttttc gagcactatg aaaaatgttc 
aactttgtcg atttttagct gaaagttttg ttaaatatga 
aaggttcata acggtctcga ctttcaaaag agcctggatt
aaattcgaag gattttatta gtttagtaga ctatttttag 
tcaataggcg tataatttat caatcttgat tctttcgggc
ataaattata gtatcgaaaa tataaaaagg agaaaagttg 
aagttaaatg accttttcga gagaagtggg ctacctcaaa 
gaaaagccga caaggaatgt tgggaatggc tagaagctgt
cggtcttagc attgttattg cttcgaatac tgtcgggaat
caacgctatt tagcagaaac tgcacttgac ggaagaattg
aactattgac tgagttcggc gactataatt attttcaaac 
ccttaagact tgtgagctat tagtcataga cgaaataggt 
tatctgtatg acttggttaa ttatagggtt gacaataact 
acgatgaaat tattgacctt ttaggccaaa ggctttatag 
ttttcaggca agcaatgtaa gaggattgga ggtaagcgaa 
tatttttctt tggcagattg tctttctttg tatttgctgc
gagcgagagt ctcaagataa ggtgattcaa agttataagc
tcgatagttc aggagcttgg ctaggaagtg ctccgggagc 
acagcatgta ggaaaattga aagaggtggg agagtgatac 
aaaagagctt atccatttta gaaaataatg gaattgacca 
tcaatttatt caagaacact tttcgagata tggaagagtt 
cctggattcg aatttttcga aattggcgaa actgatgaat 
tatataattc acttgttcca attttaacgg aagcggctga
tgcgaatata attccaaaac tagaagaact tttcaatcgc
cgaaatgcta aacttcgact agactgggcg aatactatta
cgacagggtt tgaactattg gacgacgtgc ttggaggctt 
ggctcgacct ggacaaggta agtcgtggac tattgataaa 
gatgtccttc tatatagcgg ggaaatgagt gaaatgcaag
atgttagcat caattcaatt accaaaggga tttggaacga 
tcaagcaatg actgaggctg aaaattccct tgtggtagtc 
acccctgcaa ttttagatag catgatatct aaatatagac 
tcatgagcga gtcttatcca agcagggagc agaagcgaat
taagatttct gctaaatatg gaattcctat tgtgcttaat 
ggcgctgaaa gtatggaact agaacatata gcagaaagtg 
tcgctatgaa gcgtgacgaa aaatccggca tacttgaact
ccgaaaaatc atcgaatata tgtgggacgt tgaaactgga 
ggcgaagaag gaactgaaaa aggcgaaagc tctccattga
ttcgaagtaa ggttacaagg gaaggagttg aagcattttg 
actcctgaac aaataattga aaaactttcg agacaacttg 
ctaagtcgct tggaagcaac tatcaattct catgcccgtt
ttgtggcatg agtagaaatc cttcttattc aggaagtaag 
acttgcggct acacttcagg actaactgaa ttcgtctcga
atggaaacca gtggctgaaa aggaattttg gaacatctag 
agcgtttcga agaaatggga gaactgaaaa agtcgagcat
taccggttta ttcatcctta tatgtatgaa cggaaattga
gttatgacaa actgcatgat tgcatcacct ttccagtacg
ccgtcgaagt gttcgttcta agtttcacca gtacggtgaa 
caatatgagc ttgtagcatt tcgagactat tttgaaaaac 
ttatcaactg cttgactctt tggtcaatga agattccagc 
tcaaatcaat ttactaaaac gacttcctta tagaaatatt 
cagacagcgc aggaaaaact ctaccgacag ttaaagcgaa 
aagagttcta tgataataag tgggatataa acgaccatcc 
gtagaaattc atttattatc gtataataaa gttagaaaat 
agcgaataga ctagtttcta gctatgtagg attcgaatgc
gaactagacc ctgatatgtc aattgcgtct gcttatcatc
aaaggtttaa atgcttatct cgacatgaca ttgaaagcat
aacgttcaaa tcaaaccaag gggccaagtt ttcaacttac 
ttagaatata ggtacctaaa tgcaccttcc atgaatcgaa 
tttcgacaaa tgaagaaggc gacgatttta gtatcctatc 
aattgaaatt gaagcaagtc ttgacttcat gacgctttct
attcaaaacg gtccttcagt aagcgacgca gaaattgcgc
gtcagtctaa gaagtcacta aaaaataaat taaaagattt 
52641 tatataactg gtttacaaat cacgtgaatt tcgtgtatat tatatacgaa aggacaaact ttgaaacctt

52711 aaaaacttca aaaatctctc aaccattaaa aacttataaa ggagaatcga tatgggaaaa gtatcaattc

52781 aaaaatcagg aacatttagc ccagggtcta acaacgagtt tttcacactc gctgaccacg gtgacagcgc 

52851 aattgtcact ctattgtatg atgacccgga aggcgaagac atggattact tcgtagtcca cgaagcagac 

52921 gttgacggtc gtcgacgcta tatcaactgc aatgctattg gcgaagacgg ggaaacagtc catcctgata 

52991 attgtccatt atgccaaaac ggattccctc gtattgaaaa actatttctt caactttaca accatgatac 

53061 gggaaaagtt gaaacatggg accgaggccg ttcttatgtt caaaagactg ttacatttat caataaatat 

53131 ggaagccttg tgactcagcc ttttgaaatt attcgttcag gagctaaagg tgaccaacga actacttatg 

53201 aattccttcc agagcgtccg gaagacagtg ctactcttga agattttcca gaaaagagcg aacttcttgg 

53271 aactctaatt ttagacctcg acgaagacca aatgtttgac gtggttgacg gcaagttcac tcttcaagaa 

53341 gagcgttctt caagtcgttc aaattcacgt agaggagcac ctcctgcgcc tagacgaggt tccggtcgag 

53411 aatcttcaca aggtcgaaca gctgaaagaa ctccttcagt tagtcgaaga actcctccaa cacgaggtcg 

53481 aggattctaa catgagggcg cgagccctct ttattattga ttaagaaagg gaaaataatg gcacaaaaag 

53551 gactctttgg tgcaaagcct cgttctagca agaagaacga tgctcagtta cttgctcaac ggaaaaacag 

53621 gaagcctgca gttgaggtta ctcacatttc aggaaacgct ctaaaggacg cagttgctag agctcgtact 

53691 ctttcaacta ggattcttgg acacgctctt gatagacttg agttaatcac tgaggaagca aaactcgagc 

53761 agtatgtaga caaaatgatt gaagacggaa taggttctat tgacgtagaa actgacggac tcgatactat 

53831 tcacgatgag ctggcaggag tctgcttgta ctcacctagt caaaaaggaa tctatgctcc tgtcaatcat 

53901 gttagcaata tgacgaagat gcgaattaag aatcaaattt ctcctgagtt catgaagaaa atgcttcaac 

53971 ggattgtaga ttcaggaatt cctgtcatct atcataattc gaaatttgac atgaaaccga tttattggcg 

54041 actcggcgtc aaaatgaatg agccagcgtg ggatacatat ttagccgcaa tgcttttaaa tgaaaacgag 

54111 tctcacagct tgaaaagtct tcactctaaa tatgttagga acgaagaaaa cgcagaggtt gcaaaattta 

54181 atgacttatt taaaggaatt ccttttagtt taattcctcc tgatgttgcc tatatgtatg cggcctatga 

54251 ccctttgcaa actttcgaac tctatgaatt tcaagaacaa tacttgactc caggaactga acaatgtgaa

54321 gaatataacc tggaaaaagt ctcatgggtt cttcataata ttgagatgcc tctaattaaa gttctcttcg

54391 acatggaagt ctacggtgtc gactcagacc aagataagct ggcagaaatt agagaacagt ttactgccaa 

54461 tatgaacgag gctgagcaag agtttcaaca gcttgtcagc gaatggcagc ctgaaattga agaacttcga

54531 caaactaatt cccagagcta tcaaaaactc gaaatggatg caagaggtcg agtgacggca agcatttcca 

54601 gtcctactca attagcaatt ctgttttatg atatcatggg attgaaaagt cctgaaaggg ataaacctag

54671 aggaacaggc gaaagtattg tcgagcattt tgataacgac atctcaaaag cacttttgaa atatagaaaa 

54741 tatgcaaaat tagtttcgac ctatacaaca cttgaccaac accttgcaaa gcctgacaat cgaattcaca

54811 ctacattcaa acagtacgga gctaagacag ggcgtatgtc aagtgagaat cctaacttac agaatattcc

54881 ttctcgcggt gagggtgcag tagttcgaca aatctttgca gccagtgaag ggcattacat tattggtagt

54951 gactactctc aacaagaacc tcgttcattg gcggaattaa gtggcgacga aagtatgcga catgcttacg

55021 aacaaaacct ggacctatat tcagttatcg gttcgaaact ttatggtgtt ccctatgaag agtgtttaga

55091 gttctatccc gacggaacga ctaacaagga aggaaaactt cgaagaaatt ctgtcaagtc cgttctttta

55161 ggtcttatgt acggccgcgg ggctaactca atcgctgagc agatgaatgt atctgtcaaa gaagcgaata

55231 aggttattga agatttcttc accgagttcc ctaaagtggc agactatatc atattcgttc aacagcaggc

55301 gcaggacttg ggatatgttc aaacagctac cggtcgaaga agaaggcttc ctgatatgag tcttcctgaa

55371 tacgagttcg agtatatcga cgctagcaag aacgaagatt tcgacccctt taactttgac gcagaccaac 

55441 agatggacga tactgttcct gaacatatta tcgaaaaata ttgggcccag ctagatagag cctggggatt 

55511 taagaagaag caagaaatta aagaccaggc aaaagccgaa ggaattctta ttaaggataa cggaggcaag

55581 atagctgatg ctcagcgcca atgtttgaac tcagttattc aaggaacggc agccgacatg actaagtacg

55651 caatgattaa ggtacacaat gacgctgaat tgaaagaatt aggattccat ttaatgattc cagttcacga

55721 tgagctacta ggtgaggttc ctatcaagaa cgcaaaacgg ggagcagaaa ggttgacaga agttatgatt

55791 gaagcagcca aggacattat tagtcttcca atgaaatgtg accccagtat agtagaaaga tggtatggtg

55861 aagaaattga aatctaaaat ctattcagtt gcatatataa ttctagtagt tattgcgaac cttgtgacaa

55931 tttatttcga acctttaaat gtgaaaggaa ttttaattcc tccaagcagt tggtttatgg gattcacttt 

56001 cctgcttata aatctaataa gcaagtacga gaagccaaaa tttgcaggct ctttgatatg ggtagggtta

56071 ttccttacct cgttgatttg ctttatgcaa aacctaccac aatcgcttgt cgtggcttca ggagttgcat

56141 tttggataag tcaaaaagca agtgtcttta tattcgacaa gctctcgaat aaattagact cgaagattgc

56211 aaatgctttg tctagcaaca tcggttctat tatagacgca accatatgga tttcattagg actgagtcct

56281 cttggaattg gaacggttgc atatatagat attccgtcag ccgtactagg ccaagttcta gttcagttta

56351 tcttgcagtc aattgcttcg agatatttga aaaagtagtc aggaaaattc ctgattaccc tgcagccaat

56421 tgcttcgaga tatttgaaaa agtagtcagg aaaattcctg attatttttt ttacaaaaac gcttgacttt 

56491 attcattcat tattat 
Table 29



Phage dp1 ORFs list



no 


Name 


Frame 


Position


Size 
(a.a.) 


Key words 


1 


dp1ORF001 


2 


36698..40390 


1230 


Putative tail; 


2 


dp1ORF002 


1 


32386..35835


1149 


Tail; 


3 


dp1ORF003 


3 


53538..55877


779 


DNA polymerase I; 


4 


dp1ORF004 


3 


40401..42440


679 


Minor structural; 


5 


dp1ORF005 


1 


23674..25434


586 




6 


dp1ORF006 


2 


45296..46987


563 


SWI/SNF Helicase;


7 


dp1ORF007


3 


22230..23621 


463 


Terminase; 


8 


dp1ORF008 


1 


49624..50961 


445 


DNAb Helicase; 


9 


dp1ORF009


2 


13160..14404 


414 




10 


dp1ORF010 


2 


8699..9859 


386


RecA; 


11 


dp1ORF011 


3 


28017..29096 


359 


Major head; 


12 


dp1ORF012 


3 


5346..6419 


357 


DNA pol. III beta;


13 


dp1ORF013 


3 


10215..11240


341 


DNA pol. III gamma and tau;


14 


dp1ORF014 


3 


50961..51974 


337 


DNA primase; 


15 


dp1ORF015 


1 


3793..4728 


311 




16 


dp1ORF016 


3 


43413..44303 


296 


Amidase;


17 


dp1ORF017 


1 


11242..12081 


279 




18 


dp1ORF018


3


35847..36686


279 




19 


dp1ORF019 


2 


12161..12967 


268 




20 


dp1ORF020 


1 


1864..2658 


264 


exsD; Coenzyme PQQ; 


21 


dp1ORF021


2 


2504..3295 


263 


GTP cyclohydrolase; 


22 


dp1ORF022 


2 


30896..31675 


259 




23 


dp1ORF023 


2 


6419..7195 


258 




24 


dp1ORF025


-1 


18026..18778 


250 




25 


dp1ORF024 


3 


25992..26738


248 




26 


dp1ORF026 


2 


21512..22252 


246 




27 


dp1ORF027 


1 


52762..53490 


242 




28 


dp1ORF028 


3 


44595..45299 


234 




29 


dp1ORF029 


2 


662..1348 


228 


exsB; 


30 


dp1ORF031 


3 


26943..27611 


222 




31 


dp1ORF030


-2 


19423..20088 


221 




32 


dp1ORF032 


1 


52033..52647 


204 




33 


dp1ORF033


2 


7670..8239 


189 




34 


dp1ORF035 


-1 


16859..17425 


188 




35 


dp1ORF036 


1 


48808..49362 


184 


DNAc replication; 


36 


dp1ORF037 


1 


55855..56388


177 




37 


dp1ORF034 


2 


131..652


173 




38 


dp1ORF038 


3 


1350..1871 


173 


exsC; 6-pyruvoyltetrahydropterin; 


39 


dp1ORF039 


3 


3306..3803 


165 


Citrulline biosynthesis;


40 


dp1ORF040 


1 


7192..7683


163 




41 


dp1ORF041 


3 


8208..8699 


163 


dUTPase; 


42 


dp1ORF042 


1 


48082..48561 


159 




43 


dp1ORF043 


1 


31699..32154 


151 




44 


dp1ORF044 


-1 


25211..25666


151 




45 


dp1ORF045 


2 


25340..25777 


145 




46 


dp1ORF046 


3 


42774..43202 


142 




47 


dp1ORF047 


1 


47542..47961 


139 




48 


dp1ORF048


-3 


16308..16709 


133 




49 


dp1ORF049 


-3 


43620..44018 


132 




50 


dp1ORF050 


3 


15081..15476 


131 




51 


dp1ORF051 


2 


29765..30154 


129 




52 


dp1ORF053 


-3 


49917..50300 


127 




53 


dp1ORF052 


3 


30516..30893 


125 




54 


dp1ORF054 


2 


14423..14800 


125 




55 


dp1ORF055


3 


27627..28004 


125 




56 


dp1ORF056 


-3 


18780..19151 


123 




57 


dp1ORF057 


1 


9859..10218 


119 




58 


dp1ORF058 


3 


15633..15989 


118 




59 


dp1ORF059 


1 


30154..30507 


117 




60 


dp1ORF060 


-2 


37717..38070 


117 




61 


dp1ORF062 


-3 


44940..45284 


114 




62 


dp1ORF063 


1 


47200..47541 


113 




63 


dp1ORF064 


2 


29108..29449 


113 
64


dp1ORF066


-3


28566..28898


110 




65 


dp1ORF067 


-1 


44735..45061 


108 




66 


dp1ORF068 


3 


29451..29768


105 




67 


dp1ORF069 


-3 


20094..20411 


105 




68 


dp1ORF061 


-3 


19161..19475 


104 




69 


dp1ORF070 


1 


15973..16284 


103 




70 


dp1ORF071 


3 


38904..39209


101 




71 


dp1ORF072 


-2 


50749..51045 


98 




72 


dp1ORF073 


3 


14262..14555 


97 




73 


dp1ORF074 


3 


32298..32591 


97 




74 


dp1ORF075 


-1 


22154..22447 


97 




75 


dp1ORF076 


-1 


5435..5728


97 




76 


dp1ORF077 


1 


14800..15084


94 




77 


dp1ORF079


-3


35007..35288


93 




78 


dp1ORF081 


-3 


55188..55466 


92 




79 


dp1ORF103 


2 


49352..49627


91 




80 


dp1ORF080 


1 


42490..42759 


89 




81 


dp1ORF082 


1 


44728..44994


88 




82 


dp1ORF083 


-1 


35720..35974


84 




83 


dp1ORF065 


-3 


51246..51497 


83 




84 


dp1ORF085 


-3 


10602..10847


81 




85 


dp1ORF087 


-2 


29794..30036 


80 




86 


dp1ORF088 


3 


5040..5279 


79 




87 


dp1ORF089 


-2 


12256..12495


79 




88 


dp1ORF273


3


56256..56486


76 




89 


dp1ORF078 


-3 


17280..17507


75 




90 


dp1ORF090 


1 


27037..27261


74 




91 


dp1ORF091 


1 


43189..43413 


74 


Holin; 


92 


dp1ORF092 


3 


46989..47213 


74 




93 


dp1ORF093 


-2 


45538..45756


72 




94 


dp1ORF095 


3 


8877..9089


70 




95 


dp1ORF096 


-1 


46469..46681


70 




96 


dp1ORF097 


-1 


38888..39100


70 




97 


dp1ORF098 


1 


43627..43836


69 




98 


dp1ORF099 


3 


38298..38507


69 




99 


dp1ORF100 


1 


1597..1803


68 




100 


dp1ORF101 


2 


19220..19426


68 




101 


dp1ORF094 


1 


8281..8484


67 




102 


dp1ORF102 


2 


4034..4237 


67 




103 


dp1ORF104


-1


21224..21427


67 




104 


dp1ORF105


-2


1828..2028


66 




105 


dp1ORF106 


-3 


10329..10529


66 




106 


dp1ORF108 


-1 


49250..49447 


65 




107 


dp1ORF109


-2


31435..31632


65 




108 


dp1ORF110 


1 


16444..16638


64 




109 


dp1ORF111




28657..28851


64 




110 


dp1ORF113


-2


17521..17715


64 




111 


dp1ORF084


1 


15445..15636 


63 




112 


dp1ORF114


2 


52952..53143 


63 




113 


dp1ORF115


-3


5151..5342


63 




114 


dp1ORF116


-1 


20474..20662 


62 




115 


dp1ORF117


-3


24492..24680


62 




116 


dp1ORF118


2 


15023..15208 


61 




117 


dp1ORF119


2 


41054..41239 


61 




118 


dp1ORF120 


1 


28387..28569


60 




119 


dp1ORF121


3


39222..39404


60 




120 


dp1ORF122


-1


40220..40402 


60 




121 


dp1ORF123


-2


21145..21327 


60 




122 


dp1ORF124


-3


17712..17891


59 




123 


dp1ORF125


-3 


49740..49916 


58 




124 


dp1ORF126


-3 


15960..16136 


58 




125 


dp1ORF127


-3 


13335..13511 


58 




126 


dp1ORF128


1 


4852..5025 


57 




127 


dp1ORF129


2 


25133..25306 


57 




128 


dp1ORF130 


-1 


16619..16789 


56 




129 


dp1ORF131


1 


43846..44013 


55 




130 


dp1ORF132


-1 


15137..15304 


55 




131 


dp1ORF133


-2 


7900..8061 


53 




132 


dp1ORF135


3 


780..938 


52 




133 


dp1ORF136


-1 


55094..55252 


52 




134 


dp1ORF137


-2 


36988..37146 


52 
135


dp1ORF138


-3 


30504..30662 


52 




136 


dp1ORF139


-3


11934..12092


52 




137 


dp1ORF140 


3 


20562..20717 


51 




138 


dp1ORF141


-1


42767..42922


51 




139 


dp1ORF142


-3 


31743..31898 


51 




140 


dp1ORF143


-3 


7410..7565 


51 




141 


dp1ORF144


1 


36517..36669 


50 




142 


dp1ORF145


1 


42067..42219 


50 




143 


dp1ORF146


1 


51484..51636 


50 




144 


dp1ORF147


1 


55207..55359 


50 




145 


dp1ORF148


-1


28484..28636


50 




146 


dp1ORF150 


-3 


15033..15185 


50 




147 


dp1ORF134


-2 


349..498 


49 




148 


dp1ORF151


1 


28027..28176 


49 




149 


dp1ORF152


1


42235..42384


49 




150 


dp1ORF153


2 


22307..22456 


49 




151 


dp1ORF086 


2 


52760..52906 


48 




152 


dp1ORF154


2


18446..18592


48 




153 


dp1ORF155


3


13512..13658


48 




154 


dp1ORF156


3 


18777..18923 


48 




155 


dp1ORF157


-2 


13135..13281 


48 




156 


dp1ORF158


-3


40581..40727


48 




157 


dp1ORF159


-3 


30225..30371 


48 




158 


dp1ORF149


-3


26331..26474


47 




159 


dp1ORF160 


2 


41324..41467 


47 




160 


dp1ORF161


2 


52175..52318 


47 




161 


dp1ORF162


3


13020..13163


47 




162 


dp1ORF163


3 


40224..40367 


47 




163 


dp1ORF164


-2


6553..6696


47 




164 


dp1ORF165


-3


50361..50504


47 




165 


dp1ORF166


-3 


23376..23519 


47 




166 


dp1ORF167


3 


1008..1148 


46 




167 


dp1ORF168


-2 


54205..54345 


46 




168 


dp1ORF169


-2 


45814..45954 


46 




169 


dp1ORF170 


-2 


27460..27600


46 




170 


dp1ORF171


-3


47538..47678


46 




171 


dp1ORF172


-1


10325..10462


45 




172 


dp1ORF173


-2 


32023..32160 


45 




173 


dp1ORF174


-2 


29629..29766 


45 




174 


dp1ORF175


-2 


15511..15648 


45 




175 


dp1ORF176


-3 


42894..43031 


45 




176 


dp1ORF177


-3 


19800..19937 


45 




177 


dp1ORF178


-3 


11787..11924 


45 




178 


dp1ORF112


2 


32207..32341 


44 




179 


dp1ORF179


3 


56058..56192 


44 




180 


dp1ORF180 


-1 


41042..41176


44 




181 


dp1ORF181


-1 


12992..13126 


44 




182 


dp1ORF182


-2


45235..45369


44 




183 


dp1ORF183


-2 


13762..13896 


44 




184 


dp1ORF184


-3 


53196..53330 


44 




185 


dp1ORF185


1


22522..22653


43 




186 


dp1ORF186


2 


21272..21403 


43 




187 


dp1ORF187


2 


34415..34546 


43 




188 


dp1ORF188


2 


35609..35740 


43 




189 


dp1ORF189


2 


42587..42718 


43 




190 


dp1ORF190 


3 


39786..39917 


43 




191 


dp1ORF191


-1 


40865..40996 


43 




192 


dp1ORF192


-1 


2789..2920 


43 




193 


dp1ORF193


-2


42325..42456


43 




194 


dp1ORF194


-2 


40153..40284 


43 




195 


dp1ORF195


-3


42453..42584


43 




196 


dp1ORF196


-3 


11142..11273 


43 




197 


dp1ORF107 


1 


10750..10878


42 




198 


dp1ORF197


2 


7484..7612 


42 




199 


dp1ORF198


2


24119..24247


42 




200 


dp1ORF199


-1 


15614..15742 


42 




201 


dp1ORF200


-3


47715..47843


42 




202 


dp1ORF201 


1 


38569..38694


41 




203 


dp1ORF202 


2 


44483..44608


41 




204 


dp1ORF203 


-3 


22656..22781 


41 




205 


dp1ORF204 


1 


1471..1593


40 
206


dp1ORF205


1


8524..8646


40 




207 


dp1ORF206 


1 


19855..19977 


40 




208 


dp1ORF207 


1 


27502..27624 


40 




209 


dp1ORF208 


2 


47279..47401 


40 




210 


dp1ORF209 


3 


29784..29906


40 




211 


dp1ORF210 


-1 


52955..53077 


40 




212 


dp1ORF211


-1 


20837..20959 


40 




213 


dp1ORF212


-2


52861..52983


40 




214 


dp1ORF213


-2 


30169..30291 


40 




215 


dp1ORF214


-2


24151..24273


40 




216 


dp1ORF215


-3 


35700..35822 


40 




217 


dp1ORF216


-3


32727..32849


40 




218 


dp1ORF217


1


23443..23562


39 




219 


dp1ORF218


3 


22029..22148 


39 




220 


dp1ORF219


-1 


51269..51388 


39 




221 


dp1ORF220 


-1 


6215..6334 


39 




222 


dp1ORF221


1 


43507..43623 


38 




223 


dp1ORF222


3 


13212..13328 


38 




224 


dp1ORF223


3 


14055..14171 


38 




225 


dp1ORF224


-1 


13505..13621 


38 




226 


dp1ORF225


-2 


32875..32991 


38 




227 


dp1ORF226


-2 


25075..25191 


38 




228 


dp1ORF227


-2 


22999..23115 


38 




229 


dp1ORF228


1 


10450..10563 


37 




230 


dp1ORF229


1 


27634..27747 


37 




231 


dp1ORF230 


2 


50723..50836 


37 




232 


dp1ORF231


-2 


30958..31071 


37 




233 


dp1ORF232


-2 


29272..29385 


37 




234 


dp1ORF233


-3


52779..52892


37 




235 


dp1ORF234


1


36253..36363


36 




236 


dp1ORF235


2


32768..32878


36 




237 


dp1ORF236


-1


37418..37528


36 




238 


dp1ORF237


-1 


1568..1678 


36 




239 


dp1ORF238


-3


1191..1301


36 




240 


dp1ORF239


1


26521..26628


35 




241 


dp1ORF240 


1 


41893..42000 


35 




242 


dp1ORF241


-1 


46913..47020 


35 




243 


dp1ORF242


-1


41231..41338


35 




244 


dp1ORF243


-2 


51199..51306 


35 




245 


dp1ORF244


-3 


26976..27083 


35 




246 


dp1ORF245


-3


6171..6278


35




247 


dp1ORF246


-3 


2724..2831 


35 




248 


dp1ORF247


1


29641..29745


34 




249 


dp1ORF248


1 


53560..53664 


34 




250 


dp1ORF249


2 


2012..2116 


34 




251 


dp1ORF250 


2 


23837..23941 


34 




252 


dp1ORF251


-1


39101..39205


34 




253 


dp1ORF252


-2 


54667..54771 


34 




254 


dp1ORF253


-3


56151..56255


34 




255 


dp1ORF254


-3


48375..48479


34 




256 


dp1ORF255


-3


9468..9572


34 




257 


dp1ORF256


1 


15289..15390 


33 




258 


dp1ORF257


1 


28216..28317 


33 




259 


dp1ORF258


1 


44023..44124 


33 




260 


dp1ORF259


2 


4298..4399 


33 




261 


dp1ORF260 


2 


24746..24847 


33 




262 


dp1ORF261


3


288..389


33 




263 


dp1ORF262


3 


9408..9509 


33 




264 


dp1ORF263


-1 


26951..27052 


33 




265 


dp1ORF264


-1 


6036..6139 


33 




266 


dp1ORF265


-1 


4700..4801 


33 




267 


dp1ORF266


-2 


50119..50220 


33 




268 


dp1ORF267


-2


47266..47367


33 




269 


dp1ORF268


-2 


12520..12621 


33 




270 


dp1ORF269


-3


53733..53834


33 




271 


dp1ORF270


-3


50691..50792


33 




272 


dp1ORF271


-3 


19638..19739 


33 




273 


dp1ORF272


-3 


1455..1556 


33 
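
The Position and Size columns of Table 29 can be cross-checked arithmetically. The sketch below assumes that each position range is 1-based and inclusive and that it includes the stop codon, which is consistent with dp1ORF001 in Table 30 running from 36698 to a terminal taa at 40390. The function name `orf_size_aa` and the `ROWS` sample are illustrative only and are not part of the original disclosure.

```python
def orf_size_aa(start: int, end: int) -> int:
    """Amino-acid count for an ORF spanning start..end (1-based, inclusive).

    Assumption: the span includes the stop codon, so one codon is
    subtracted from the total number of codons.
    """
    length_nt = end - start + 1
    if length_nt <= 0 or length_nt % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length_nt // 3 - 1  # drop the stop codon


# A few rows copied from Table 29: (name, start, end, listed size in a.a.)
ROWS = [
    ("dp1ORF001", 36698, 40390, 1230),
    ("dp1ORF008", 49624, 50961, 445),
    ("dp1ORF011", 28017, 29096, 359),
]

for name, start, end, listed in ROWS:
    status = "ok" if orf_size_aa(start, end) == listed else "MISMATCH"
    print(name, orf_size_aa(start, end), listed, status)
```

A check of this kind is one way to flag rows whose coordinates or sizes were damaged in transcription, since a span that is not a multiple of three raises an error.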
Table 30

Predicted Dp-1 amino acid sequences 

dp1ORF001

36698 atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaagtatatgaccaaaacttcaatctaattggagca 

1 MIDNNLPMSPI PGEIVQVYDQNFNLIGA 

36782 agtgatgaaatctttagcaagcattacgaagacgaaattgtgactcgagctcgaggaaaagaaactttcacttttgaaagtatt 

29 SDEIFSKHYEDEIVTRARGKETFTFESI 

36866 gaaacctcatctatctatcaacacttaaaggttgaaaacattatccagtatggaggaagacggtttcgaattaaatatgctcag 

57 ETSSIYQHLKVENI IQYGGRWFRIKYAQ 

36950 gacgtagaagatgtcaaagggcttaccaagtttacctgctacgcattatggtatgaactagcagaaggcttgcctaggaagttg 

85 DVEDVKGLTKFTCYALWYELAEGLPRKL 

37034 aaacacgttgcttcttctgtaggcgctgtcgcgctagatattatcaaagacgcaggtgaatgggctcgactagtttgtcctcct 

113 KHVASSVGAVALD I IKDAGEWVRLVCPP 

37118 gacggtgctaacaaacaagttcgaagcataacagccgcagaaaattcaatgctttggcatcttcgatatcttgcaaagcaatac 

141 DGANKQVRS ITAAENSMLWHLRYLAKQY 

37202 aatttagaactgacatttggttatgaagaaattaccaagcaagaggttagaattgttcaaaccgttgtatttcttcagccttat 

169 NLELTFGYEEI IKQEVRIVQTVVFLQPY 

37286 gtcgagtctaaagtagactttcctcttgtagttgaagagaatttgaaacatgtcactaggcaggaagattctcgaaacctgtgt 

197 VESKVDFPLVVEENLKYVTRQEDSRNLC 

37370 acggcttacaagttgacaggtaaaaaggaagaaggcagtcaagagcctttaacgtttgcttctatcaacaatggaagtgaatat 

225 TAYKLTGKKEEGSQEPLTFAS INN GSEY 

37454 ctcattgatgtttcgtggtttactacacgccacatgaagcctcgatatattgctaaatctaaaagcgacgaacattttagaatt 

253 LIDVSWFTTRHMKPRYIAKSKSDEHFRI 

37538 aaagaaaatttgatgagtgctgcgcgtgcttatcttgacatctacagtcgcccactaattggatatgaggcttcagcggtccCt 

281 KENLMSAARAYLDIYSRPLIGYEASAVL 

37622 tataacaaggttcctgacttgcaccatactcaactaattgtcgacgaccattatgatgttatcgagtggcgaaagatatctgcc 

309 YNKV PDLHHTQLI VDDHYDVI EWRKI SA 

37706 cgaaaaattgactacgacgacctttcaaactctactatcattttccaagacccccgaaaagacttgatggacttgctaaatgag

337 RKIDYDDLSNSTI IFQDPRKDLMDLLNE 

37790 gacggcgaaggagtcctttcaggggaaactgtaaatgagtcccaagttgttattagatacgcagatgacattttagggactaat 

365 DGEGVLSGETVNESQVVIRYADDILGTN 

37874 tttaatgcagaatctgggaaacacattggtgtccttaatactaataagaaaccgagcgaattagttcctgacgactttacatgg 

393 FNAESGKYIGVLNTNKKPSELVPDDFT W 

37958 attcgactagaaggtcctaaaggtgacgcaggtttaccgggagctcctgggcgtgatggagtcgacggtgtacctggaaagagc 

421 IRLEGPKGDAGLPGAPGRDGVDGVPGKS 

38042 ggagtagggatagcagatacagctatcacttatgctgtatccgtttccggaacgcaagagcctgaaaatggatggagcgaacaa 

449 GVGIADTAITYAVSVSGTQEPENGWSEQ 

38126 gttcctgaactcataaaaggtcgattcttgtggactaaaacattttggagatatactgacggctcacatgaaactggatactcc 

477 VPELIKGRFLWTKTFWRYTDGSHETGYS 

38210 gttgcctatatagggcaagacggaaattccggaaaagacggaatcgcaggtaaggacggagtaggtatagccgcaactgaagtc 

505 VAYIGQDGNSGKDGIAGKDGVGIAATEV 

38294 atgtatgcaagttcgccatctgctactgaagcCccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtat 

533 MYASSPSATEAPAGGWSTQVPTVPGGQY 

38378 ttatggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcct 

561 LWTRTRWRYTDQTDEIGYSVSRMGEQGP 

38462 aaaggtgacgcaggtcgtgacggtattgcaggaaagaacggaatagggttgaagtcaacttcagtttcttatggaattagtccc 

589 KGDAGRDGIAGKNGIGLKSTSVSYGISP

38546 actgattctgcgattcctggagtatgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttgg 

617 TDSAIPGVWASQVPSLI KGQYLWTRTIW 

38630 acctataccgattcaactaccgaaacgggctatcaaaaaacctacattccaaaagacgggaatgacggtaaaaatggaattgct 

645 TYTDSTTETGYQKTYIPKDGNDGKNGIA 

38714 ggtaaggatggggtaggaattaagtccacgaccattacctacgcaggctcaacctcaggaacagttgcgcctacttcaaattgg 

673 GKDGVGIKSTTITYAGSTSGTVAPTSNW 

38798 acttctgctattccaaatgttcaaccgggattcctcttgtggacgaaaactgctcggaactatactgatgacactagcgaaaca 

701 TSAI PNVQPGFFLWTKTVWNYTDDTSET 

38882 ggttactcagtttccaagataggtgaaacaggtcctagaggagttcaaggtcttcaaggtcctcaagggcttcaaggaattcct 

729 GYSVSKIGETGPRGVQGLQGPQGLQGIP 

38966 ggacctgcaggagctgacggacgttcgcaatatactcacctcgctttctctaatagtccaaacggtgagggatttagtcatact 

757 GPAGADGRSQYTHLAFSNSPNGEGFSHT

39050 gacagcggacgagcatacgtcggtcagtatcaagatttcaatcccgtccattcaaaagaccctgcagcctatacatggacgaaa 

785 DSGRAYVGQYQDFNPVHSKDPAAYTWTK 

39134 tggaaggggaatgacggagctcaagggatacccgggaagccaggcgcagacggtaagactaattatctccatatagcttacgct 

813 WKGNDGAQGI PGKPGADGKTNYFHIAYA 

39218 tcaagtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatacgggttattactccgattatgagcaagca

841 SSADGSREFSLEDNNQQYMGYYSDYEQA

39302 gatagcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattct 

869 DSRDRTKYRWFDRLANVQVGGRNEFLNS 

39386 ttatttgaatttggtttaaaacctcgctattctagttacaatctaatggacggacaagatcaaacgcaaggacagatatctgct 

897 LFEFGLKPRYSSYNLMDGQDQTQGQISA 

39470 actattgacgaacgtcaacggttcaaaggtgctaactctttacgacttgactcaacatggaacggtaaaccgcagaaccaaaaa 

925 TIDERQRFKGANSLRLDSTWNGKPQNQK 
39554 ctgaccttttctttaggaggagatacgcgattaggtactccaaccgagtggtctaatttagaaggtcgtatcagtttctgggct

953 LTFSLGGDTRLGTPTEWSNLEGRISFWA 

39638 aaggcctctaggaacggagtgagcttagctgcacggccgggttatcgtagtaacgtatttaccgcaaccttaaccgatcaatgg 

981 KASRNGVSLAARPGYRSNVFTATLTDQW 

39722 aagttctacgattttaaattctttgacaaagtcaattcaaactgtaccgctgaagcaattttccatgtattcactcaaagttgt 

1009 KFYDFKFFDKVNSNCTAEAI FHVFTQSC 

39806 tcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcagaggaagaccttaaatatcga 

1037 SVWLKHIKIELGNI STPFSEAEEDLKYR 

39890 attgactcaaaagccgatcaaaagctaactaaccaacagttgacggcactcacggaaaaggctcaactacatgacgcagaactg 

1065 IDSKADQKLTNQQLTALTE KAQLHDAEL 

39974 aaagctaaggctacaatggagcagttaagtaacttagaaaaggcttatgaaggtagaatgaaagctaacgaagaagctatcaaa 

1093 KAKATMEQLSNLEKAYEGRMKANEEAIK 

40058 aaatcggaagccgacctaatcttagcggcaagtcgaattgaagctactatccaagaacttggcgggctacgggaactgaagaag

1121 KSEADLILAASRIEATIQELGGLRELKK 

40142 ttcgtcgacagttacatgagctcttctaatgaaggtctaattatcggtaagaacgacggtagctctaccattaaggtatcaagt 

1149 FVDSYMSSSNEGLI IGKNDGSSTIKVSS 

40226 gaccgaatctccatgttctccgcagggaatgaagttatgcaccttacgcaagggttcactcacatcgataacgggatctttacc 

1177 DRISMFSAGNEVMYLTQGF IHIDNGIFT 

40310 caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaacgtgattcggtatgtaggataa 40390

1205 QSIQVGRFRTEQYSFNPDMNVIRYVG* 
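
Each 84-nucleotide line of Table 30 is paired with the 28 amino acids it encodes. The pairing can be reproduced with the short sketch below, which assumes the standard genetic code (for sense and stop codons this matches the bacterial code) and forward-strand reading from the first base of the line. The names `CODON_TABLE` and `translate` are illustrative and do not come from the original disclosure; codons containing characters other than a, c, g, t are reported as `X`.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt: str) -> str:
    """Translate a forward-strand nucleotide string, codon by codon.

    Stop codons are written as '*'; unrecognised codons as 'X'.
    """
    nt = nt.lower()
    if len(nt) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt), 3))


# First line of dp1ORF001 (positions 36698 onward) as listed in Table 30.
first_line = (
    "atgattgacaataatttacctatgagtccaattcctggcgaaattgtt"
    "caagtatatgaccaaaacttcaatctaattggagca"
)
print(translate(first_line))
```

Comparing the output of such a translation with the listed amino acid line is one way to spot characters that were misread in either the nucleotide or the protein sequence.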
dp1ORF002

32386 atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtcaagcg 

1 MDFGSIAAKMTLDISNFTSQLNLAQSQA

32470 caacggctcgcactagagtcttcgaagtcctttcaaattggttctgctctaacaggattagggaaaggacttacgactgcggtt 

29 QRLALESSKSFQIGSALTGLGKGL TTAV 

32554 acccttcctcttatgggatttgcagccgcctctattaaagtagggaatgaattccaagcccaaatgtcccgtgttcaagctatt 

57 TLPLMGFAAASI KVGNEFQAQMSRVQAI 

32638 gcaggagcgacagcggaagagcttggtagaatgaagactcaagcaatcgaccttggtgctaaaactgcttttagtgcaaaagag 

85 AGATAEELGRMKTQAIDLGAKTAFSAKE 

32722 gcggctcaaggtatggaaaatctagcttcagccggtttccaggtaaatgaaatcatggacgctatgccaggggtacttgacctg 

113 AAQGMENLASAGFQVNEIMDAMPGVLDL

32806 gctgccgtacctggaggagatgtggccgcgagctccgaggccatggctagttcacttcgagcctttggattagaggcaaaccag 

141 AAVSGGDVAASSEAMASSLRAFGLEANQ 

32890 gcgggtcacgtggctgacgtatttgctcgagcagcagccgatacgaacgcagaaactagcgacatggcagaggcgatgaaatac 

169 AGHVADVFARAAADTNAETSDMAEAMKY 

32974 gtcgcacccgttgctcactctatgggcttgagccttgaagaaacggctgcgtctattgggattatggccgacgccggtattaag 

197 VAPVAHSMGLSLEETAAS IG IMADAGI K 

33058 ggctcgcaagccggaaccacgcttagaggcgctctctcgcgcattgccaaacctacgaaagcgatggtcaaatcaatgcaggaa 

225 GSQAGTTLRGALSRIAKPTKAMVKSMQE 

33142 ttaggagtttcgttctacgacgcgaacggaaacatgattccactaagagaacaaatcgctcaactgaaaacagctactgcagga 

253 LGVSFYDANGNMIPLREQIAQLKTATAG 

33226 ctaacacaagaggaacgaaatcgtcaccttgttaccttgtatggccaaaactcgttgtcaggtatgcttgcactattagacgca 

281 LTQEERNRHLVTLYGQNSLSGMLALLDA 

33310 ggtcctgagaaattggataagatgaccaatgctctcgtgaactcggacggagctgctaaggaaatggcagaaactatgcaggac 

309 GPEKLDKMTNALVNSDGAAKEMAETMQD 

33394 aaccttgctagtaaaatcgagcaaatgggaggagctttcgagtctgttgctattattgttcaacaaatccttgagcctgcactt 

337 NLASKIEQMGGAFESVAI IVQQI LEPAL 

33478 gctaaaatcgtgggagcaatcacaaaagttctcgaagcattcgtaaatatgtcacctatcggtcaaaagatggttgtcatattc 

365 AKIVGAITKVLEAFVNMS P I GQKMVVIF 

33562 gcaggaatggttgcagcccttggaccactgcttctaattgcaggaatggtgatgacaactattgtcaagttaagaattgctatt 

393 AGMVAALGPLLLIAGMVMTTIVKLRIAI

33646 cagtttttaggtccagcatttatgggaacgatgggaaccattgcaggagttatagcaatattctatgctctggtcgccgtgttc 

421 QFLGPAFMGTMGTIAGVIAI FYALVAVF 

33730 atgatagcctacacaaaatcggagagatttagaaactttatcaacagtcttgcgcctgctattaaagctgggtttggaggagcg

449 MIAYTKSERFRNFINSLAPAI KAGFGGA 

33814 ttggaatggctacttccacgactgaaagagttaggagaatggttacagaaggcaggcgagaaggcgaaagagttcggtcagtct 

477 LEWLLPRLKELGEWLQKAGEKAKEFGQS 

33898 gtagggtctaaagcgtcaaaactgctcgaacagtttggaacaagtatcggtcaggcaggaggctcgattggtcagttcattgga 

505 VGSKVSKLLEQFGISIGQAGGS IGQFIG 

33982 aatgttctcgaaaggctaggaggcgcatttggaaaagtaggaggagtcatttcaattgctgtttcacttgtaacaaaattcggt 

533 NVLERLGGAFGKVGGVISIAVSLVTKFG

34066 ctcgcatttctagggattacaggaccactcgggattgctattagtctgttagtttcatttttgacagcttgggccagaacaggt 

561 LAFLGITGPLGIAISLLVSFLTAWARTG 

34150 gagttcaacgcagacggaattactcaagtattcgaaaacttgacaaacacaattcagtcgacggctgatttcatctctcaatac 

589 EFNADGITQVFENLTNTIQSTADFISQY 

34234 cttccagtctttgtcgaaaaaggaactcaaattttagttaagattattgaaggaattgcatctgctgttcctcaagtagttgaa 

617 LPVFVEKGTQILVKI IEGIASAVPQVVE 

34318 gtgatttcacaagtcattgaaaatattgtgatgacaatctcgacagttatgcctcaattagtcgaagcaggaattaagaj:acct: 

645 VISQVIENIVMTISTVMPQLVEAGI —RT I L 

34402 gaagcgcttataaatggtcttgttcaatctcttcctactatcattcaagcagctgctcaaattatcactgctttattcaatggt 

673 EALINGLVQSLPTIIQAAVQIITALFNG

34486 cttgtccaggcactccctacgcttattcaagcaggtcttcaaattttgtcagctctcataaacggactagttcaagcgcttccg 

701 LVQALPTLIQAGLQILSALINGLVQALP

34570 gcaattattcaagcagctgttcaaattatcatgtcgcttgttcaagcactaattgaaaacttgcccatgataatcgaagcagcg

729 AIIQAAVQIIMSLVQALIENLPMIIEAA 
34654 atgcagattataatgggtctagtcaacgcactgattgaaaatataggacctatcttagaagcagggattcaaattctaatggct

757 MQIIMGLVNALIENIGPILEAGIQILMA 

34738 ttaatcgagggacttattcaagtgcttcctgaactaattacagcagcgattcaaatcattacttcactattagaagcaatcttg 

785 LIEGLIQVLPELITAAIQIITSLLEAIL 

34822 tcgaaccttcctcaacttccagaagccggagctaaactgcttttatcacttcttcaagggttgccaaatatgctccctcaacta 

813 SNLPQLLEAGVKLLLSLLQG LLNMLPQL 

34906 attgcaggggctttgcaaatcatgatggcacttcttaaagcagttatcgacttcgtccctaaacttcctcaagcaggtgttcaa 

841 IAGALQIMMALLKAVIDFVPKLLQAGVQ 

34990 cttcttaaggcattgattcaaggtattgcttcacttctcggctcacttttatcgacagctggaaacatgctttcatcatcagtt 

869 LLKALIQGIASLLGSLLSTAGNMLSSLV 

35074 agcaagattgctagctttgcgggacagatggtttcaggaggtgcgaacctgattcgaaacctcattagtggtattgggtcaatg 

897 SKIASFVGQMVSGGANLIRNFI SGIGSM 

35158 attggcccagctgtctctaaaattggcagcatgggaacttcaattgcttctaaggttactggattcgctggacaaatggtaagc 

925 IGSAVSKIGSMGTSIVSKVTGFAGQMVS 

35242 gcaggggtcaaccttgttcgaggatttatcaatggtatcagttccatggtaagctctgcggtaagtgcggcggctaatatggct 

953 AGVNLVRGFINGISSMVSSAVSAAANMA 

35326 agcagtgcattaaatgccgttaagggattcttaggtattcactctccttcacgtgtcatggagcagatgggtatccatacgggt 

981 SSALNAVKGFLGIHSPSRVMEQMGIYTG 

35410 caagggttcgtaaatggtattggtaacatgattcgaactacacgtgacaaggctaaagaaatggctgaaactgttactgaagct 

1009 QGFV NGIGNMIRTTRDKAKEMAETVTEA 

35494 ctcagcgacgtgaagatggatattcaagaaaatggagttatagaaaaggttaaatcagtttacgaaaagatggctgaccaactt 

1037 LSDVKMDIQENGVI EKVKSVYEKMADQL 

35578 cctgaaactcttccagctcctgatttcgaagatgttcgtaaagcagccggttcgccccgagtggacttgttcaatacaggaagt 

1065 PETLPAPDFEDVRKAAGS PRVDLFNTGS 

35662 gacaaccctaaccaacctcagtcacaatccaaaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtagttcga 

1093 DN PNQPQS QS KNNQGEQTVVN I GT IVVR 

35746 aacaatgacgacgttgacaaactgtcgagaggattgtataatagaagtaaagaaactctatcagggtttggtaacattgtaaca 

1121 NNDDVDKLSRGLYNRSKETLSGFGNIVT 

35830 ccgtaa 35835

1149 P * 
dp1ORF003

53538 atggcacaaaaaggactctttggtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacagg 

1 MAQKGLFGAKPRSSKKNDAQLLAQRKNR 

53622 aagcctgcagttgaggttacttacatttcaggaaacgctctaaaggacgcagttgctagagctcgtactctttcaactaggatt 

29 KPAVEVTYISGNALKDAVARARTLSTRI 

53706 cttggacacgttcttgatagacttgagttaatcactgaggaagcaaaactcgagcagtatgtagacaaaatgattgaagacgga 

57 LGHVLDRLELITEEAKLEQYVDKMIED G 

53790 ataggttctattgacgtagaaactgatggactcgatactattcacgatgagctggcaggagtccgcttgtactcacctagtcaa 

85 IGSIDVETDGLDTIHDELAGVCLYSPSQ 

53874 aaaggaatctatgctcctgtcaatcatgttagcaatatgacgaagatgcgaattaagaatcaaatttctcctgagttcatgaag 

113 KGIYAPVNHVSNMTKMRI KNQISPEFMK 

53958 aaaatgcttcaacggattgtagattcaggaattcctgtcatctatcataattcgaaatttgacatgaaatcgatttattggcga 

141 KMLQRIVDSGI PVI YHNSKFDMKSIYWR 

54042 ctcggcgtcaaaatgaatgagccagcgtgggatacatatttagccgcaatgcttttaaatgaaaacgagtctcacagcttgaaa 

169 LGVKMNEPAWDTYLAAMLLNENESHSLK

54126 agtcttcactctaaatatgttaggaacgaagaaaacgcagaggttgcaaaatttaatgacttatttaaaggaattccttttagt 

197 SLHSKYVRNEENAEVAKFNDLFKGIPFS 

54210 ttaattcctcctgatgttgcctatatgtatgcggcctatgaccctttgcaaactttcgaactctatgaatttcaagaacaatac 

225 LIPPDVAYMYAAYDPLQTFELYEFQEQY 

54294 ttgactccaggaactgaacaatgtgaagaatataacctggaaaaagtctcatgggttcttcataatattgagatgcctctaatt 

253 LTPGTEQCEEYNLEKVSWVLHNIEMPLI 

54378 aaagttctcttcgacatggaagtctacggtgtcgacttagaccaagataagctggcagaaattagagaacagtttactgccaat 

281 KVLFDMEVYGVDLDQDKLAE I REQFTAN 

54462 atgaacgaggctgagcaagagtttcaacagcttgtcagcgaatggcagcctgaaattgaagaacttcgacaaactaatctccag 

309 MNEAEQEFQQLVSEWQPEIEELRQTNFQ

54546 agctatcaaaaactcgaaatggatgcaagaggtcgagtgacggtaagcatttccagtcctactcaattagcaattctgttttat

337 SYQKLEMDARGRVTVSISSPTQLAILFY 

54630 gatatcatgggattgaaaagtcctgaaagggataaacctagaggaacaggcgaaagtattgtcgagcatttcgataacgatatc 

365 DIMGLKSPERDKPRGTGSSIVEHFDNDI 

54714 tcaaaagcacttttgaaatatagaaaatatgcaaaattagtttcgacctatacaacacttgaccaacaccttgcaaagcctgac 

393 SKALLKYRKYAKLVSTYTTLDQHLAKPD 

54798 aatcgaattcacactacattcaaacagtacggagctaagacagggcgtatgtcaagtgagaatcctaacttacagaatattcct

421 NRIHTTFKQYGAKTGRMSSENPNLQNIP 

54882 tctcgcggtgagggtgcagtagttcgacaaatctttgcagccagcgaagggcattacattattggtagtgactactctcaacaa 

449 SRGEGAVVRQIFAASEGHYI IGSDYSQQ 

54966 gaacctcgttcattggcggaattaagtggcgacgaaagtacgcgacatgcttacgaacaaaacctggacctatatccagttatc 

477 EPRSLAELSGDESMRHAYEQNLDLYSVI 

55050 ggtccgaaactttatggtgttccccatgaagagtgtttagagttctatcccgacggaacgactaacaaggaaggaaaacttcga 

505 GSKLYGVPYEECLEFYPDGTTNKEGKLR

55134 agaaattctgtcaagtccgttcttttaggtcctatgtacggccgcggggctaactcaatcgctgagcagatgaatgtatctgtc 

533 RNSVKSVLLGLMYGRGANS IAEQMNVSV 

55218 aaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttcaacagcaggcg 

561 KEANKVI EDFFTEFPKVADYI I FVQQQA 

55302 caggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatgagtcttcctgaatacgagttcgagtat 

589 QDLGYVQTATGRRRRLPDMSLPEYEFEY 

55386 accgacgctagcaagaacgaagatttcgacccctttaactttgacgcagaccaacagatggacgatactgttcctgaacatatt 
617 IDASKNEDFDPFNFDADQQMDDTVPEHI

55470 atcgaaaaatattgggcccagctagatagagcctggggatttaagaagaagcaagaaattaaagaccaggcaaaagccgaagga 

645 I EKYWAQLDRAWGFKKKQEI KDQAKAEG 

55554 attcttattaaggataacggaggcaagatagctgatgctcagcgccaatgtttgaactcagttattcaaggaacggcagccgac 

673 ILIKDNGGKIADAQRQCLNSVIQGTAAD 

55638 atgactaagtacgcaatgattaaggtacacaatgacgctgaattgaaagaattaggattccatttaatgattccagttcacgat 

701 MTKYAMIKVHNDAELKELGFHLMIPVHD 

55722 gagttactaggtgaggttcctatcaagaacgcaaaacggggagcagaaaggttgacagaagttatgattgaagcagccaaggac 

729 ELLGEVPIKNAKRGAERLTEVMIEAAKD

55806 attattagtcttccaatgaaatgtgaccccagtatagtagaaagatggtatggtgaagaaattgaaatctaa 55877

757 IISLPKKCDPSIVERWYGEEIEI* 
dp1ORF004

40401 atgacaaaatttatcaactcatacggccctcttcacttgaacctttacgtcgaacaagttagtcaggacgtaacgaacaactcc 

1 MTKFINSYGPLHLNLYVEQVSQDVTNNS 

40485 tcgcgagttagttggcgagctactgtcgaccgcgatggagcttatcgaacgtggacttatggaaatattagtaacctttccgta 

29 SRVSWRATVDRDGAYRTWTYGNISNLSV 

40569 tggttaaatggttcaagtgttcatagcagtcacccagactacgacacgtccggcgaagaggtaacgctcgcaagtggagaagtg 

57 WLNGSSVHSSHPDYDTSGEEVTLASGEV

40653 actgttcctcacaatagtgacgggacaaagacaatgtccgtttgggcttcgtttgaccctaataacggcgttcacggaaatatc 

85 TVPHNSDGTKTMSVWASFDPNNGVHGNI 

40737 actatctctactaattacactttagacagtantccaaggtctacacagatttctagttttgagggaaatcgaaatctaggatcc 

113 TISTNYTLDSIPRSTQISSFEGNRNLGS 

40821 ttacatacggttatctttaaccgaaaagtgaactcttttacgcatcaagtttggtaccgagtttccggtagcgactggatagat 

141 LHTVIFNRKVNSFTHQVWYRVFGSDWID 

40905 ttaggtaagaaccatactactagcgtatcctttacgccgtcactggacttagcaaggtacttacctaaatcaagttccggaaca 

169 LGKNHTTSVSFTPSLDLARYLPKSSSGT

40989 atggacatctgtattcgaacctataacggaactacgcaaattggtagtgacgtctattcaaacggatggaggttcaacatcccc 

197 MDICIRTYNGTTQIGSDVYSNGWRFNIP 

41073 gattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagattttaacagggaacaacctc 

225 DSVRPTFSGISLVDTTSAVRQILTGNKF

41157 ctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggatccactatccaagcatttcacgctgag 

253 LQIMSNIQVNFNNASGAYGSTIQAFHAE 

41241 ctcgtaggtaaaaaccaagctaccaacgaaaacggcggcaaattgggcatgatgaactttaatggctccgctaccgtaagagca 

281 LVGKMQAINENGGKLGMMNFNGSATVRA 

41325 tgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaat 

309 WVTDTRGKQSNVQDVSINVIEYYGPSIN

41409 ttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaaggtcgcacctataacggtaggaggt 

337 FSVQRTRQNPAIIQALRNAKVAPITVGG

41493 caacagaaaaacatcatgcaaattaccttctccgtggcgccgttgaacactactaatttcacagaagatagaggttcggcgtca 

365 QQKNIMQITFSVAPLNTTNFTEDRGSAS

41577 gggacgttcactactatttccctaatgactaactcgtccgcgaacttagctggtaactacgggccggacaagtcttacatagtt 

393 GTPTTISLMTNSSANLAGNYGPDKSYIV 

41661 aaggctaaaatccaagacaggttcacttcgactgaatttagtgctacggtagctaccgaatcagtagttcttaactatgacaag 

421 KAKIQDRFTSTEFSATVATESVVLNYDK 

41745 gacggtcgacttggagttggtaaggttgcagaacaagggaaggcagggtcaattgacgcagcaggtgatatatatgctggaggt 

449 DGRLGVGKVVEQGKAGSIDAAGDIYAGG 

41829 cgacaagttcaacagtttcagctcactgataataatggagcattgaacaggggtcaatataacgatgtttggaataagcgtgaa 

477 RQVQQFQLTDNNGALNRGQYNDVWNKRE 

41913 acagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggggactatttcaaaatttctgg 

505 TEFTWRSNKYEDNPTGTRGEWGLFQNFW 

41997 ttagatagctggaaaatggttcaatccttcattacaatgtcaggaagaatgttcatcaggacagcgaacgatggaaacagctgg 

533 LDSWKMVQSFITMSGRMFIRTANDGNSW

42081 agacctaacaagtggaaagaggttctatctaagcaagacttcgaacagaataattggcagaaacctgttcttcaaagtgggtgg 

561 RPNKWKEVLFKQDFEQNNWQKLVLQSGW

42165 aaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatagtatatttgagaggaaatgtgcataaagga 

589 NHHSTYGDAFYSKTLDGIVYLRGNVHKG 

42249 cttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgcatcttcaggctctcaataac

617 LIDKEATIAVLPEGFRPKVSMYLQALNN 

42333 tcatatggaaatgccattctatgtatatacactgacggaagacttgtggtgaaatcgaatgtagataattcttggttaaattta 

645 SYGNAILCIYTDGRLVVKSNVDNSWLNL 

42417 gacaatgtctcatttcgtatttaa 42440 

673 DNVSFRI* 
dplORF005

23674 atggctaaaaaatcaaaagctatctcacacacagacgaactgattagtcagtcgtttgacagccccttggcaaagaatcaaaag 

1 MAKKSKAISHTDELISQSFDSPLAKNQK 

23758 ttcaagaaagagcttcaggaagttgaaaagtattatcaatactccgacggatttgatgtcacggacttgaatactgactatggg

29 FKKELQEVEKYYQYFDGFDVTDLNTDYG 

23842 caaacatggaagattgacgaagactcagccgactataaacctactcgagaaattcgaaactatactcgacaacttatcaaaaag 

57 QTWKIDEDSVDYKPTREIRNYIRQLIKK

23926 caatcacgctttatgatgggtaaagagccagagctcatctttagtccagtccaagacaatcaagacgaacaggctgagaacaag 

85 QSRFMMGKEPELIFSPVQDNQDEQAENK

24010 cgtattctattcgactctatttcaaggaatcgtaaattctggagcaaaagtacaaatgcattagtcgacgccacagtaggtaag 

113 RILFDSILRNCKFWSKSTNALVDATVGK 

24094 cgggtattgatgacagtagtagcaaatgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagtt 

141 RVLMTVVANAAQQIDVQFYSMPQFTYTV 

24178 gaccctagaaacccttccagcttgctttctgttgacattgtttatcaggacgagcgcacaaaaggaatgagcactgaaaaacaa 
169 DPRNPSSLLSVDIVYQDERTKGMSTEKQ

24262 ctttggcatcattatagatatgaaatgaaagctggaacaagtcaatcaggaattgcaacagctttagaagacattgaagaacaa

197 LWHHYRYEMKAGTSQSGIATALEDIEEQ 

24346 tgttggctcacttatgccctaacggatggagagtcgaaccaaatctatatgacagaaagtggccaaactactatcaaggagaca 

225 CWLTYALTDGESNQIYMTESGQTTIKET 

24430 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttcctttaaaagttcaagaatccgccccaaccggc 

253 EAKLVEIEDNLGNKIEVPLKVQESAPTG 

24514 ttgaagcaaattccttgtcgagttattcttaatgaaccattgactaatgacatatacgggacaagcgatgtcaaagaccttatc 

281 LKQIPCRVILNEPLTNDIYGTSDVKDLI 

24598 acagtagcagataacttgaacaaaactattagtgacttacgagattcacttcgatttaaaatgttcgagcagcctgttatcatt 

309 TVADNLNKTISDLRDSLRFKMFEQPVII

24682 gatggctcttctaagtcaattcaaggaatgaagactgcgccaaacgctttggtcgaccttaagagtgaccctacttcctcaatc 

337 DGSSKSIQGMKIAPNALVDLKSDPTSSI 

24766 ggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccagcggctgaatattatttagag 

365 GGTGGKQAQVTSISGNFNFLPAAEYYLE

24850 ggcgctaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaattgcaatgcag 

393 GAKKAMYELMDQPMPEKVQEAPSGIAMQ 

24934 tcctcactctacgacctaatttctcgatgtgacggaaaatggattgagtgggatgatgctattcaatggctcattcaaatgctg 

421 FLFYDLISRCDGKWIEWDDAIQWLIQML 

25018 gaagaaattttagcaacagtgaatgttgacttgggaaatattcctcaagatattcaatcaagttatcaaacacttacgacaatg 

449 EEILATVNVDLGNIPQDIQSSYQTLTTM

25102 actatcgaacaccactatccaattcctagcgatgaactttctgctaagcaacttgcgctcaccgaagttcaaactaatgtacgc 

477 TIEHHYPIPSDELSAKQLALTEVQTNVR

25186 agccaccaatcttacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcag 

505 SHQSYIEEFSKKEKADKEWERILEELAQ

25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcctcaagatgaaacgagtgaa 

533 LDEISAGALPVLANELNEQEEPQDETSE 

25354 gaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 25434

561 EDEVDDKEKEQTEQPTEEGVDPDVQG*
dplORF006 

45296 atgattgaaatcgttatagcacgttcgaaagctaggcgaggtcgaaccctatttattgaaacatgggcaagcactgatgaagat 

1 MIEIVIARSKARRGRTLFIETWASTDED

45380 gcagttaaaatggcagaaaagatttccagcttgcccaatgtagtcgagacgtcttctaataacttcgaactaccttataagtat 

29 AVKMAEKISSLPNVVETSSNNFELPYKY 

45464 ttcaataatgttatagacgctctagatgaatgggagcttcacatcttcggcgaacttgataaagatgttcaagactacattgac 

57 FNNVIDALDEWELHIFGELDKDVQDYID

45548 tctcgaaaccgaatagcttcttcaagcaatgagcagttttcgtccaagactactccattcgcgcaccaggttgaatgtttcgaa

85 SRNRIASSSNEQFSFKTTPFAHQVECFE 

45632 tacgcacaagagcatccatgtttccttttaggcgatgagcaaggtttagggaaaactaaacaggcaattgatattgcagttagc 

113 YAQEHPCFLLGDEQGLGKTKQAIDIAVS 

45716 aggaaggcaagtttcaaacattgtttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtattcat 

141 RKASFKHCLIVCCISGLKWNWAKEVGIH 

45800 tcaaatgagtcagctcatattttaggaagtcgagtcactaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 

169 SNESAHILGSRVTKDGKLVIDGVSKRAE 

45884 gacttgcttggtggccacgacgaattcttccttatcactaacattgaaactcttcgcgatgctgtgttcactaaatacctaaat 

197 DLLGGHDEFFLITNIETLRDAVFIKYLN 

45968 gaactgacaaaaagcggagaaattggaatggtcactattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 

225 ELTKSGEIGMVIIDEIHKCKNPSSKQGA

46052 tcaattcaaaagctccaaagttattacaagatgggacttacaggaactcctctaatgaataacccaatcgatgtattcaatgtt 

253 SIQKLQSYYKMGLTGTPLMNNPIDVFNV

46136 atgaagtggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaatcact 

281 MKWLGAEHHTLTQFKERYCIVDQFNQIT 

46220 ggatatcgaaatctagctgaacttcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct 

309 GYRNLAELRELVNDYMLRRTKEEVLDLP 

46304 gaaaagattcgagtcacagagtatgtcgacatgaactcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa 

337 EKIRVTEYVDMNSKQSKIYKEVLTKLVQ 

46388 gaaatagataaagtcaagctcatgcctaaccctctagccgaaacgattcgacttcgacaagcgactggaaatccttcgatttta 

365 EIDKVKLMPNPLAETIRLRQATGNPSIL 

46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagtcctgcgtg 

393 TTQDVKSCKFERCIEIVEECIQQGKSCV

46556 atatttagcaattgggaaaaggttattgaacctcttgctaagatactttcgaagacagtcaaatgcaacctggtaacaggagaa 

421 IFSNWEKVIEPLAKILSKTVKCNLVTGE 

46640 accgcagataagttcaacgaaattgaagaatttatgaatcacagaaaggcttctgttattttaggaactataggtgcgctagga 

449 TADKFNEIEEFMNHRKASVILGTIGALG

46724 acaggatttactttgacgaaagcggatacggttattttcttagatagtccgtggacacgcgcagaaaaggaccaagccgaagat 

477 TGFTLTKADTVIFLDSPWTRAEKDQAED

46808 aggtgtcatagaattggcgcaaaaagttctgtcactatccacacgcttgtcgccaaaggtactgtcgacgaacg^tatagaag^c 

505 RCHRIGAKSSVTIYTLVAKGTVDERIED

46892 cttattgaacggaaaggagaattagcagattatatcgtagatggtaagcctatgaaatctaaaattggtaaccttttcgatatc 

533 LIERKGELADYIVDGKPMKSKIGNLFDI

46976 ctgcttaaatag 46987 

561 LLK*
dplORF007 

22230 atgacaataagcctgagaaataaactacctaagttcaacttcgtcccttttagtaagaaacaactccagctcctaacatggtgg 
1 MTISLRNKLPKFNFVPFSKKQLQLLTWW

22314 acaaagggctcaccttttcgaactttcgatatcgccatagcagacggttccattcgttcaggaaaaacagtatcgatggctctt 

29 TKGSPFRTFDIVIADGSIRSGKTVSMAL 

22398 tcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaagacaattcactcagctcgacgaaat 

57 SFSLWAMTEFNGQNFAICGKTIHSARRN 

22482 gttattcagcctctaaagcaaatgctcacaagtcgcgggtatgaaattcgagatgttcgaaatgaaaatctacttattattaga
85 VIQPLKQMLTSRGYEIRDVRNENLLIIR
22566 cactttagaaatggcgaagaaattgtcaactactcccatatatttggaggaaaagatgagtcgagtcaagaccttatacagggg 
113 HFRNGEEIVNYFYIFGGKDESSQDLIQG
22650 gtaacattagcaggtatcttctgtgatgaggtggcactgatgcctgaatcgtttgtcaaccaagcgacagggcgctgttccgta 
141 VTLAGIFCDEVALMPESFVNQATGRCSV
22734 acaggttcgaaaatgtggttctcttgtaacccggccaatcctaatcactacttcaagaagaactggattgacaaacaggtcgaa 
169 TGSKMWFSCNPANPNHYFKKNWIDKQVE 
22818 aagcgtatcttatatcttcactttacaatggacgacaaccctagcttgacggatagcactaaaaggcgctatgagaaaatgtat 
197 KRILYLHFTMDDNPSLTDSIKRRYEKMY
22902 gctggagtcttcaggaaaagatttattctcggcctttgggtaacagcagatggtctagtttattcaatgttcaatgaagagcag 

225 AGVFRKRFILGLWVTADGLVYSMFNEEQ 
22986 catgtcaaaaagctcaatatagaattcgaccgtttattcgtagcaggcgactttggcatctataatgcaacaaccttcggcctt 
253 HVKKLNIEFDRLFVAGDFGIYNATTFGL
23070 tatggattctcgaaacgtcataagcgctaccatctaattgagtcacactaccactcagggcgcgaggcggaagagcaactaact 
281 YGFSKRHKRYHLIESYYHSGREAEEQLT 
23154 gaggcggatgttaattcgaatattcaatctagttcagttctacaaaagactactaaagagtacgcaaatgatttagtcgatatg 
309 EADVNSNIQFSSVLQKTTKEYANDLVDM
23238 atacgaggaaagcaaatcgaatatataattctcgacccgtctgcttctgctatgattgttgaacttcaaaagcatccttatata 
337 IRGKQIEYIILDPSASAMIVELQKHPYI
23322 gctagaaagaatatccctatcattcctgctcgaaatgacgtgacgcttggcatttcatttcacgctgaactcttggctgagaat 
365 ARKNIPIIPARNDVTLGISFHAELLAEN
23406 agatttacactcgaccctagcaacacgcacgacattgatgaatactatgcttacagctgggacagtaaagcgagccaaacggga
393 RFTLDPSNTHDIDEYYAYSWDSKASQTG 
23490 gaagatagagtcattaaagagcatgaccactgcatggataggaacagatatgcctgtctcactgacgctctaatcaacgatgac
421 EDRVIKEHDHCMDRNRYACLTDALINDD 
23574 ttcggtttcgaaatacaaatattatccggaaaaggcgctagaaactaa 23621 

449 FGFEIQILSGKGARN* 
dplORF008

49624 gtgatacagcttcaagtcttaaataaagttctcgaagaaaagagcttatccattttagaaaataatggaattgaccaagaatac 

1 VIQLQVLNKVLEEKSLSILENNGIDQEY

49708 ttcacggattatttagacgagtatcaatttattcaagaacacttttcgagatatggaagagttccggacgacgaaactattctc 

29 FTDYLDEYQFIQEHFSRYGRVPDDETIL 

49792 gaccattttcctggattcgaatttttcgaaattggcgaaactgatgaataccttatcgacaagctaaaagaggagcatctatat 

57 DHFPGFEFFEIGETDEYLIDKLKEEHLY 

49876 aattcacttgttccaattttaacggaagcggctgaggacattcaagtagatagtaacattgcgattgcgaatataattccaaaa

85 KSLVPILTEAAEDIQVDSNIAIANIIPK 

49960 ctagaagaacttttcaatcgctctaaattcgtaggcggactagacattgctcgaaatgctaaacttcgactagactgggcgaat 

113 LEELFNRSKFVGGLDIARNAKLRLDWAN 

50044 actattagaaaccatgacggtgaaagacttggaatatcgacagggtttgaactattggacgacgtgcttggaggcttacttcct 

141 TIRNHDGERLGISTGFELLDDVLGGLLP

50128 ggtgaggatttgattgtcataatggctcgacctggacaaggtaagtcgtggactactgataaaatgcttgcaactgcttggaag 

169 GEDLIVIMARPGQGKSWTIDKMLATAWK 

50212 aacgggcatgatgtccttctatacagcggggaaacgagcgaaatgcaagttggcgctcgtatagatactattctttcgaatgtt 

197 NGHDVLLYSGEMSEMQVGARIDTILSNV 

50296 agcatcaattcaattaccaaagggatttggaacgaccatcagttcgaaaaatatgaggaccatattcaagcaatgactgaggct

225 SINSITKGIWNDHQFEKYEDHIQAMTEA 

50380 gaaaattcccttgtggtagtcacgccctttatgactggaggaaagaaccttacccctgcaattttagatagcatgatatctaaa 

253 ENSLVVVTPFMIGGKHLTPAILDSMISK

50464 tatagaccatctgtggtggggattgaccagctttcactcatgagcgagtcttatccaagcagggagcagaagcgaatccagtac 

281 YRPSVVGIDQLSLMSESYPSREQKRIQY 

50548 gccaacatcaccatggacctatataagatttctgctaaatatggaattcctattgtgcttaatgtccaagcagggcgttcggct 

309 ANITMDLYKISAKYGIPIVLNVQAGRSA

50632 aaaactgaaggcgctgaaagtatggaactagaacatatagcagaaagtgatggagtaggtcaaaatgctagcagagttatcgct 

337 KTEGAESMELEHIAESDGVGQNASRVIA 

50716 atgaagcgtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatat 

365 MKRDEKSGILELSVVKNRYGEDRKIIEY 

50800 atgtgggacgttgaaactggaacctatactcttataggattcaaagaggaaggcgaagaaggaactgaaaaaggcgaaagctct 

393 MWDVETGTYTLIGFKEEGEEGTEKGESS 

50884 ccattgaaagcaaaagcctctaggtcgactgctcgtcttcgaagtaaggttacaagggaaggagttgaagcattttga 50961 

421 PLKAKASRSTARLRSKVTREGVEAF* 
dplORF009 

13160 atgacagactttaaaaaacgcttcaagaaagcagtaacagaaacaatcaatcgtgacggtatcgagaaccttatggattggctc

1 MTDFKKRFKKAVTETINRDGIENLMDWL

13244 gaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaagctatgaaggtggacttgtcgagcactcatta 

29 ENDTNFFSSPASTRYHGSYEGGLVEHSL

13328 aacgtgttcaatcaactacttttcgaaatggataccatggtaggcaaaggctgggaagacatttacccaatggaaacagttgca 

57 NVFNQLLFEMDTMVGKGWEDIYPMETVA 

13412 atcgtagcactatttcacgacctttgcaaagttggtcagtatcgcgaaactgaaaaatggcgcaagaacagcgacggtgaatgg 

85 IVALFHDLCKVGQYRETEKWRKNSDGEW 

13496 gaaagctatctagcatatgaatacgaccccgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttc 
113 ESYLAYEYDPEQLTMGHGAKSNFLLQRF

13580 attcaactcacgccagttgaagctcaagcaattttctggcacatgggagcctatgatattagtccttatgcaaatttgaatgga 

141 IQLTPVEAQAIFWHMGAYDISPYANLNG

13664 tgtggagcagccttcgaaactaatccacttgcattcttaatccatcgcgcagatatggccgcaactcatgtagtcgaaaatgaa 

169 CGAAFETNPLAFLIHRADMAATYVVENE

13748 aacttcgaatactctcaaggtccagttgaacaagaggctgaggttgaagaagtagttgaagaaaaacctaagagttcaactcgt 

197 NFEYSQGPVEQEAEVEEVVEEKPKSSTR 

13832 aagaaacctgcgcctaaggaagaaaaagttgaagaggctgaagaaaaaccaaaagctggaatcactcgacgtcgcaaacctgcg 

225 KKPAPKEEKVEEAEEKPKAGITRRRKPA

13916 ccaaaagaggaagaggtagaagagcctaaagaagagcctaagaaagcatcttctaaaattcgaatgcctaaaaagactgaaaag 

253 PKEEEVEEPKEEPKKASSKIRMPKKTEK

14000 gtcgaagaggtagaaagcgcagacgagccgaaagttgaagaagcagaggacgacaatgtggtggtacctgctggatatgttcga 

281 VEEVESADEPKVEEAEDDNVVVPAGYVR 

14084 gatgtctactacttctacagtgaagtcgctgacgtttactacaagaaagatgtcgacgagcctgacgatgacagcgacattctt 

309 DVYYFYSEVADVYYKKDVDEPDDDSDIL

14168 gtagacgaagaagagtacatggacgcaatgtgtcctgtattagaagaagacttcttctacgaacttgacggcaaggttcacaaa 

337 VDEEEYMDAMCPVLEEDFFYELDGKVHK 

14252 ttagcaaaaggtgaacgcttgccggaagaatacgacgaagaaacctgggaacctatcactgaagcagaatacatcaagcgaaca 

365 LAKGERLPEEYDEETWEPITEAEYIKRT

14336 gaaaaacctaaagcagttgcaaaacctactcgaaaaactccagcgccttctcgtcgccctcgcccttaa 14404 

393 EKPKAVAKPTRKTPAPSRRPRP*

dplORF010

8699 atgaaattggaacagttgatgaaggactggaataaggattcgaaagctcttgtagcagttcaaggacttgaacgtgaagcgctt 

1 MKLEQLMKDWNKDSKALVAVQGLEREAL 

8783 ccaagaatccctttttctgcgccttctatgaattatcaaacctacggcgggctccctcgaaaaagggtagttgaattcttcggt 

29 PRIPFSAPSMNYQTYGGLPRKRVVEFFG

8867 cctgagtcaagtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaag 

57 PESSGKTTSALDIVKNAQMVFEQEWEQK 

8951 actgaagaactcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactc 

85 TEELKEKLENARASKASKTAVKELEMQL 

9035 gatagtcttcaagagcctcttaagattgtatatcttgaccttgagaatacattagacactgagtgggctaaaaagattggagtc 

113 DSLQEPLKIVYLDLENTLDTEWAKKIGV

9119 gatgttgacaatatttggatagttcgccctgaaatgaacagcgctgaagaaatacttcaatatgttttagacattttcgaaaca 

141 DVDNIWIVRPEMNSAEEILQYVLDIFET

9203 ggtgaagttggcctagtagttctagattccttgccttacatggtcagtcaaaaccttattgatgaagagttgactaaaaaggcc 

169 GEVGLVVLDSLPYMVSQNLIDEELTKKA

9287 tatgcaggaatctcagcgcctttgactgaatttagtcgaaaggttactcctcctcttactcgctacaatgcaatattcctaggc 

197 YAGISAPLTEFSRKVTPLLTRYNAIFLG 

9371 atcaatcaaattcgagaagatatgaatagtcagtacaatgcctattcaactccaggcggaaagatgtggaagcatgcttgtgca 

225 INQIREDMNSQYNAYSTPGGKMWKHACA 

9455 gttcgacttaaatttagaaaaggtgactaccttgacgaaaacggtgcatcattgacccgtactgctcgaaaccctgcagggaat 

253 VRLKFRKGDYLDENGASLTRTARNPAGN 

9539 gtagtagagtcattcgtcgagaagaccaaagcatttaagccggacagaaaattagtttcctatacgctttcctatcatgatgga 

281 VVESFVEKTKAFKPDRKLVSYTLSYHDG

9623 attcaaattgaaaatgaccttgtagatgtcgctgtcgaatttggagtcattcaaaaggcaggggcatggttcagtatcgtcgac 

309 IQIENDLVDVAVEFGVIQKAGAWFSIVD 

9707 cttgaaactggagaaattatgacagatgaagacgaagaaccattgaagttccaaggcaaggcaaatctagttcgacgcttcaag 

337 LETGEIMTDEDEEPLKFQGKANLVRRFK 

9791 gaggatgactacttattcgacatggtgatgactgcggttcacgaaattatcactcgagaagaaggctaa 9859 

365 EDDYLFDMVKTAVHEIITREEG*
dplORF011

28017 atgaatatttatgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttgga 

1 MNIYDYINAGEIASYIQALPSNALQYLG 

28101 ccaactcttttccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaactatccag 

29 PTLFPNAQQTGTDISWLKGANNLPVTIQ 

28185 ccatctaactacgacgcgaaagcaagtcttcgtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgag

57 PSNYDAKASLRERAGFSKQATEMAFFRE 

28269 tctatgcgacttggtgaaaaagaccgtcaaaacttgcaaatgctattgaaccaaagttcagctcttgcccaaccacttatcact 

85 SMRLGEKDRQNLQMLLNQSSALAQPLIT 

28353 caactctataatgatactaagaaccttgtagacggtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggt 

113 QLYNDTKNLVDGVEAQAEYMRMQLLQYG 

28437 aaattcactgtcaaatcaactaacagcgaggctcaatacacttacgactacaacatggatgctaagcaacaatatgcagtcact 

141 KFTVKSTNSEAQYTYDYNMDAKQQYAVT 

28521 aagaaatggactaacccagctgaaagtgaccctatcgctgacattttagcagcaatggatgacatcgaaaatcgtacaggtgtt 

169 KKWTNPAESDPIADILAAMDDIENRTGV 

28605 cgccctactcgaatggtctcgaaccgaaacacttataaccaaatgactaagagtgactctatcaagaaagctcttgcaattggt 

197 RPTRMVLNRNTYNQMTKSDSIKKALAIG 

28689 gttcaaggttcttgggaaaacttcttgcttcttgcaagtgacgctgagaaattcatcgctgaaaaaacaggtct^aaatcict 

225 VQGSWENFLLLASDAEKFIAEKTGLQIA

28773 gtccactctaagaaaattgctcagttcgctgacgctgacaaacctcctgacgttggtaacattcgtcagttcaacttgattgac 

253 VYSKKIAQFADADKLPDVGNIRQFNLID 

28857 gacggtaaagtggtattgcttccacctgacgcagttggtcacacttggtacggtactactccagaagcattcgacttggcttca 

281 DGKVVLLPPDAVGHTMYGTTPEAFDLAS 

28941 ggcggaacagacgctcaagttcaagttctttcaggcggacctaccgttacaacttatcttgaaaaacatcctgccaacattgca 

309 GGTDAQVQVLSGGPTVTTYLEKHPVNIA 
29025 acagttgtatcagctgttatgattccatcattcgaaggaattgactatgtaggagttctcacaactaattag 29096

337 TVVSAVMIPSFEGIDYVGVLTTN*
dplORF012 

5346 atgagtattaagttcaaaaccgaagaactttcaaaaattgtttctcagctcaataagttgaagcctagcaagttgctagaaatc 

1 MSIKFKTEELSKIVSQLNKLKPSKLLEI 

5430 acaaactattggcatatttttggtgacggcgaacgcgtcatgtttacagcgtatgatggctcaaacttccttcgatgcattatc 

29 TNYWHIFGDGECVMFTAYDGSNFLRCII

5514 gacagcgatgttgaaattgacgtgactgtgaaagcagagcagtttggaaaacttgtagaaaagaccacggccgcaaccgtcaca 

57 DSDVEIDVIVKAEQFGKLVEKTTAATVT 

5598 ttagttcctgaagaatcttcgctaaaagttattgggaatggtgagtacaatattgatattgttacagaagatgaagagtaccct

85 LVPEESSLKVIGNGEYNIDIVTEDEEYP 

5682 acattcgaccacttgctcgaagacgtgagtgaagaaaatgctctcactttgaaaagctcgctgttctacggaatcgccaatatc 

113 TFDHLLEDVSEENALTLKSSLFYGIANI

5766 aacgattctgcggtatctaaatcaggagcagatggaatttataccggcttcctgttaaaaggcggaaaagcaattactacagac 

141 NDSAVSKSGADGIYTGFLLKGGKAITTD

5850 atcattcgcgtatgtatcaaccctatcaaggaaaagggactagaaatgctcattccttacaacctaatgagtattttagcaagt 

169 IIRVCINPIKEKGLEMLIPYNLMSILAS

5934 attcctgatgagaagatgtacttctggcaaactgacgatactactgtctatatttcatcggcttcagtcgaaatttatggaaaa 

197 IPDEKMYFWQIDDTTVYISSASVEIYGK 

6018 ttgatggaaggtatggaagattatgaagacgtttcacagcttgactcaattgagtttgaagatgatgcggctatccctacagca 

225 LMEGMEDYEDVSQLDSIEFEDDAAIPTA 

6102 gaaatcccgagcgtattagaccgccttgtactattcacttcagcctttgacaaaggaaccgtcgaattcttattcttgaaagac 

253 EILSVLDRLVLFTSAFDKGTVEFLFLKD 

6186 cgacttcgaattaaaacttctactagcagttatgaagacatcatgtacgcatctgctggcaagaaagtttcgaagaaagaattc 

281 RLRIKTSTSSYEDIMYASAGKKVSKKEF 

6270 acttgccaccttaacagcttactcttgaaggaaattgtatcaaccgtcaccgaagaaaacttcactgtctcttatggaagcgaa 

309 TCHLNSLLLKEIVSTVTEENFTVSYGSE 

6354 accgcaattaagatttcatcgaatggtgtcgtttacttcctagcacttcaagagccggaagaataa 6419 

337 TAIKISSNGVVYFLALQEPEE*
dplORF013 

10215 atgaatttagcttctaaataccgtcctcaaactttcgaggaagtggtagctcaagaatatgtcaaagaaattcttttgaatcaa 

1 MNLASKYRPQTFEEVVAQEYVKEILLNQ 

10299 ttacaaaatggcgctatcaaacacggctatctattctgtggtggcgctggaactggtaaaaccactactgctcgaattttcgcg 

29 LQHGAIKHGYLFCGGAGTGKTTTARIFA 

10383 aaggatgtgaacaaaggacttggctctcctattgaaattgatgctgcttctaataatggggtagaaaatgttcgaaacattatt

57 KDVNKGLGSPIEIDAASNNGVENVRNII 

10467 gaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgctttcaaccggagcattt 

85 EDSRYKSMDSEFKVYIIDEVHMLSTGAF

10551 aatgcgctgttgaaaacattagaagagccctcatcgggaaccgtgttcattctatgtactactgaccctcaaaagattcctgac 

113 NALLKTLEEPSSGTVFILCTTDPQKIPD

10635 actattctcagtcgagttcaacggtttgaccttactcgaattgataatgacgacatcgttaatcaacttcaatttattatcgaa 

141 TILSRVQRFDFTRIDNDDIVNQLQFIIE

10719 agtgaaaatgaagaaggagctggttatagttatgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgt 

169 SENEEGAGYSYERDALSFIGKLANGGMR

10803 gacagtatcacaaggctcgaaaaagtccttgattatagtcatcacgttgacatggaagccgtttctaatgcactaggagttccg 

197 DSITRLEKVLDYSHHVDMEAVSNALGVP 

10887 gactacgaaacatccgcttcacttgttgaagctattgccaactatgacggctcaaagtgtttagaaattgtaaatgacttccac 

225 DYETFASLVEAIANYDGSKCLEIVNDFH

10971 tactcaggaaaagacttgaaattagtgactcgaaactttacagacttccttttagaggtttgtaagtattggctagttcgagat 

253 YSGKDLKLVTRNFTDFLLEVCKYWLVRD 

11055 atttcaatcactcaacttcctgctcattttgaaagtaagccagagcaatcctgtgaggcttttcaatatcctactctattgtgg 

281 ISITQLPAHFESKLEQFCEAFQYPTLLW 

11139 atgctagaagaaatgaatgaacttgctggagttgttaaatgggagcctaatgctaaaccgataattgaaaccaaacttcttttg 

309 MLEEMNELAGVVKWEPNAKPI IETKLLL 

11223 atgagcaaggaggagtga 11240 

337 MSKEE*
dplORF014 

50961 atgaaagtaaatggtcttcaaattgaagcgactcctgaacaaataattgaaaaactttcgagacaacttgaagacgaaggaaca 

1 MKVNGLQIEATPEQIIEKLSRQLEDEGT

51045 ttcatttttagacgaactaagtcgcttggaagcaactatcaattctcatgcccgtttcatgcaggagggactgaaaagcatccc 

29 FIFRRTKSLGSNYQFSCPFHAGGTEKHP

51129 tcttgtggcatgagtagaaatccttcttattcaggaagtaaggtgacggaagctggaacggttcactgtttcacttgcggctac 

57 SCGMSRNPSYSGSKVTEAGTVHCFTCGY 

51213 actccaggactaactgaattcgtctcgaatgtattaggtcgaaacgatggagggttctatggaaaccagtggctgaaaaggaat 

85 TSGLTEFVSNVLGRNDGGFYGNQWLKRN 

51297 tttggaacatctagcgaagtagttaggcaaggcgtcagccctgaagcgtttcgaagaaatgggagaactgaaaaagtcgagcat 

113 FGTSSEVVRQGVSPEAFRRNGRTEKVEH

51381 aaaatcattcctgaagaggaacttgataaataccggtttattcatcctcatatgtatgaacggaaattgacggacgagctcatc 

141 KIIPEEELDKYRFIHPYMYERKLTDELI

51465 gagatgtttgatgtaggttatgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttc 

169 EMFDVGYDKLHDCITFPVRNLKGETVFF

51549 aaccgtcgaagtgttcgttctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagctt 

197 NRRSVRSKFHQYGEDDPKTEFLYGQYEL 

51633 gtagcatttcgagactattttgaaaaacccattagtcaagtattcgtgactgagtctgttatcaactgcttgactctttggtca 

225 VAFRDYFEKPISQVFVTESVINCLTLWS 

51717 atgaagattccagcagtcgctcttatgggagtaggtggaggaaatcaaatcaatttactaaaacgacttccttatagaaatatt 
253 MKIPAVALMGVGGGNQINLLKRLPYRNI

51801 gttctagcacttgaccctgataacgctgggcagacagcgcaggaaaaactctaccgacagttaaagcgaagcaaggtcgttaga 

281 VLALDPDNAGQTAQEKLYRQLKRSKVVR 

51885 tttttgaactaccctaaagagttctatgataataagtgggatataaacgaccatccggaattattaaattttaatgatttagtc 

309 FLNYPKEFYDNKWDINDHPELLNFNDLV 

51969 ttgtag 51974 

337 L*
dplORF015

3793 atgggatttaatctatacttcgcaggaggtcacgctattagcactgacgattatttgaaggaaagaggagccaaccgcctattc 

1 MGFNLYFAGGHAISTDDYLKERGANRLF

3877 aatcaactgtacgaaagaaacgggattggcaaaaggtggattgagcataagaaaaccaatccaagcactacttcaaaactattc 

29 NQLYERNGIGKRWIEHKKTNPSTTSKLF 

3961 gtcgactctagtgcatattctgctcataccaaaggggctgaagttgacattgacgcctatatcgaatacgtgaatgataacgtg 

57 VDSSAYSAHTKGAEVDIDAYIEYVNDKV

4045 ggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagcttttggaagca 

85 GMFDCIAELDKIPGVFRQPKTREQLLEA

4129 ccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattttccatatggga 

113 PQISWDNYLYMRERMVEKDKLLPIFHMG

4213 gaagactttaaatggctcaacttgatgctcgaaactacattcgaaggcggaaagcatattccttacattggaatttcaccagcc 

141 EDFKWLNLMLETTFEGGKHIPYIGISPA 

4297 aatgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaag 

169 NDSTTKHKDKWMERVFEVIRNSSNPDVK 

4381 actcacgcatttgggatgacagttactagccaattagagcgtcacccattctatagcgccgactctacttctgtactgctcaca 

197 THAFGMTVTSQLERHPFYSADSTSVLLT 

4465 ggagcgatgggaaacattatgacgtcaaaaggattagttgacttgtcacagaagaatggaggaattgatgctgtccgtaggctg 

225 GAMGNIMTSKGLVDLSQKNGGIDAVRRL

4549 ccaaaaccggttcaagttgaaattgaatccattatcgaagaaactggagcgcattttagcctagagcaattagttgaggactat 

253 PKPVQVEIESIIEETGAHFSLEQLVEDY

4633 aaacttcgagcattgttcaatgttcaatacatgctgaattgggcagagaactatgaattcaagggaattaaaaatcgtcaacgt 

281 KLRALFMVQYMLNWAENYEFKGIKNRQR 

4717 cgactattttag 4728 

309 RLF*
dplORF016 

43413 atgggagtcgatattgaaaaaggcgtcgcgtggatgcaggcccgaaagggtcgagtatcttatagcatggactttcgagacggt 

1 MGVDIEKGVAWMQARKGRVSYSMDFRDG

43497 cctgatagctatgactgctcaagttctatgtactatgctctccgctcagccggagcttcaagtgctggatgggcagtcaatact

29 PDSYDCSSSMYYALRSAGASSAGWAVNT

43581 gagtacatgcacgcatggcttattgaaaacggttatgaactaattagtgaaaatgctccgtgggatgctaaacgaggcgacatc 

57 EYMHAWLIENGYELISENAPWDAKRGDI 

43665 ttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcatacagggatgttcattgacagtgataacatcattcactgc 

85 FIWGRKGASAGAGGHTGMFIDSDNIIHC 

43749 aactacgcctacgacggaatttccgtcaacgaccacgatgagcgttggtactatgcaggtcaaccttactactacgtctatcgc 

113 NYAYDGISVNDHDERWYYAGQPYYYVYR 

43833 ttgactaacgcaaatgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaac 

141 LTNANAQPAEKKLGWQKDATGFWYARAN 

43917 ggaacttatccaaaagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgct 

169 GTYPKDEFEYIEENKSWFYFDDQGYMLA 

44001 gagaaatggttgaaacatactgatggaaattggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggc 

197 EKWLKHTDGNWYWFDRDGYMATSWKRIG 

44085 gagtcatggtactacttcaatcgcgatggttcaatggtaaccggttggattaagtattacgataattggtattattgtgatgct 

225 ESWYYFNRDGSMVTGWIKYYDNWYYCDA

44169 accaacggcgacatgaaatcgaatgcgtttatccgttataacgacggctggtatctactattaccggacggacgtctggcagat 

253 TNGDMKSNAFIRYNDGWYLLLPDGRLAD 

44253 aaacctcaattcaccgtagagccggacgggctcattactgctaaagtttaa 44303 

281 KPQFTVEPDGLITAKV* 
dplORF017 

11242 atgattggacagggacttgttaaatctaccatttcgaaatggaaacaacttccaaaatatataatcgtcgaaggtgaagtaggt 

1 MIGQGLVKSTISKWKQLPKYIIVEGEVG

11326 tcaggacggaagaccttaatccgttatattgcttcgaaatttgacgctgattctattgtagtaggaacgagtgtagatgacatt 

29 SGRKTLIRYIASKFDADSIVVGTSVDDI

11410 cgaaacatcattcaggatgcacagactattttcaaggcgagaatctacgtgatagacggaaatagcctgtcaatgtcagctctt 

57 RNIIQDAQTIFKARIYVIDGNSLSMSAL

11494 aactcgcttttgaagatagcggaagagccacctttaaactgtcatatagccacgactgttgatagcatcaataatgctttacct 

85 NSLLKIAEEPPLNCHIAMTVDSINNALP

11578 acgcttgcaagtagagcaaaagttctaaccatgctaccttatactaatgaagagaaaatgcagtttgtcaagtcctacaagaag 

113 TLASRAKVLTMLPYTNEEKMQFVKSYKK 

11662 gtagatacttcaggaattgacgaccgagcgattgtagactattgcaatcttgccagcaatcttcaaatgcttgaagacatatta 

141 VDTSGIDDRAIVDYCNLASNLQMLEDIL

11746 gaatatggcgcagaagagctatttgaaaaggttacaacattttacgactcaatatgggaggcaagtgctag^aattcgctaaag 

169 EYGAEELFEKVTTFYDLIWEASASNSLK 

11830 gttactaattggctcaaatttaaggaaactgatgaaggaaaaattgagcctaaacttttcctcaactgtcttttaaattggtcg 

197 VTNWLKFKETDEGKIEPKLFLNCLLNWS

11914 acagttgtcatcaggaagcactatgtagaaatgtctttcgaagaacttgaggcccatgaccttttagtgagggaagcatctagg 

225 TVVIRKHYVEMSFEELEAHDLLVREASR 
11998 tgtttgcgaaaggtatctaaaaagggctcaaatgcgcgtgtctgcgtgaacgaatttatcaggagggtcaaacaagttgagtga 12081

253 CLRKVSKKGSNARVCVNEFIRRVKQVE*
dplORF018

35847 atggctagcagacagacgctattggtcgacggaattgaccttgtcgacaaaggtgcaaccgtgctagaatatgtaggactcact 

1 MASRQTLLVDGIDLVDKGATVLEYVGLT 

35931 ttcgcaggatttaaggactcaggatttaaaaaccctgaaggcatagacggagtattagattctccgtctaatgctatgtccgct 

29 FAGFKDSGFKNPEGIDGVLDSPSNAMSA

36015 cttactggaagcgtgaccttaatgttccacggagaaaccgaaaagcaagttaatcaaaaatacaggcagttcaaacaatttatt 

57 LTGSVTLMFHGETEKQVNQKYRQFKQFI 

36099 cgctcgaagtcattttggagaatttcgacacttgaagaccctggatactatcgaacgggaaaatttttaggagaaaccgagcaa 

85 RSKSFWRISTLEDPGYYRTGKFLGETEQ 

36183 ggaaaacttgtagacgttcaagcctttaaagatacttcccttgtagttaaattagggattcagttcaaagatgcttacgagtac 

113 GKLVDVQAFKDTSLVVKLGIQFKDAYEY 

36267 agcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccaggaagacctactcga 

141 SDSTVRKVYKFQPALGGDSLPNPGRPTR

36351 caatttagagtagaaataagaactacttctcaaatcaaaggatattttcgaattggcgaaaaaagttcaggacagtttgttgag 

169 QFRVEIRTTSQIKGYFRIGEKSSGQFVE 

36435 ttcggtactaattcagtattgatggaaagtggctcgattattattctaaatcttggaacttttgaacttattaaaattagcagt 

197 FGTNSVLMESGSIIILNLGTFELIKISS

36519 gcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaattcaacaattaccatt 

225 ANQATNLFRYIKRGAFFKIPNGNSTITI

36603 gaataccgagccgatgacgcagcagcttggacctctacccttcccgctcaagttgaactgtttctaaatccgtcttactattag 36686

253 EYRADDAAAWTSTLPAQVELFLNPSYY* 
dplORF019 

12161 atgaatgtttatctcaatcaaatgggaaatgtagttcgagaaacttcggtttcaacagtctggaaaaccctcactcaaaaaggg 

1 MNVYLNQMGNVVRETSVSTVWKTLTQKG 

12245 ctcgtttctaatcatcgaatattcgctgttcgagatgacaaggagtttctgtctaatgagtcgaggtggaaaaggcttccggat 

29 LVSNHRI FAVRDDKEFLSNESRWKRLPD 

12329 gttagatatgggacacttgttttgatggttactaaaattgacaagcgaagcaagttgctaaaggcctttcctgataattgtgtt 

57 VRYGTLVLMVTKIDKRSKLLKAFPDNCV 

12413 gagtttgagaaaatgactgacgcgcagttgaaaaggcattttgtgtctaaatactcgactattgatagcgacatgattgacatg 

85 EFEKMTDAQLKRHFVSKYSTIDSDMIDM

12497 gttatccagttctgtctaaacgattactctagaattgacaatgaattggacaagctgtcgcgattgaaaaaggttgacgcatca

113 VIQFCLNDYSRIDNELDKLSRLKKVDAS 

12581 gtagttgaatccattgtcaagcacaagaccgaaattgacattttcagcctagttgatgatgtattggaatataggccggagcag 

141 VVESIVKHKTEIDIFSLVDDVLEYRPEQ 

12665 gcaattatgaaagtgactgaacttttagccaaaggagaaagtcctattggattgcttaccttgctttatcaaaattttaataac 

169 AIMKVTELLAKGESPIGLLTLLYQNFNN

12749 gcttgtcttgtgctaggagccgatgagcctaaagaagccaatctaggcattaagcagttcttaatcaataagattgtctataac 

197 ACLVLGADEPKEANLGIKQFLINKIVYN

12833 tttcaatacgagctggactcagcctttgaaggcatggctattttaggtcaagctatcgagggcataaagaatggtcgctataca 

225 FQYELDSAFEGMAILGQAIEGIKNGRYT

12917 gaaagttcagtggtctatatttctttgtataaaattttttcacttacttaa 12967

253 ESSVVYISLYKIFSLT* 
dplORF020 

1864 atggttaatcaatacaatcagcctgaaagaggcaagattcgaatcaatgttcgcgaccctgagaaaatgcctatcatggaaatt

1 MVNQYNQPERGKIRINVRDPEKMPIMEI 

1948 ttcggtcctacaattcaaggtgaaggaatggttataggtcaaaagactattttcattcgaactggtggatgcgactatcattgc

29 FGPTIQGEGMVIGQKTIFIRTGGCDYHC 

2032 aactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaagaagctgctagtcgaatcttg 

57 NWCDSAFTWNGTTEPEYITGKEAASRIL 

2116 aaactagctttcaatgataaaggtgaacagatttgtaaccacgtgacattgactggaggaaatcctgccttaatcaacgagcct 

85 KLAFNDKGEQICNHVTLTGGNPALINEP 

2200 atggctaagatgatttcgattctaaaagaacatggattcaagtttggtctcgaaactcaaggaactcgattccaagaatggttc 

113 MAKMISILKEHGFKFGLETQGTRFQEWF

2284 aaagaagtaagcgatatcactattagtcctaaaccgccttcaagtggaatgagaactaatatgaaaattcttgaagctattgta 

141 KEVSDITISPKPPSSGMRTNMKILEAIV 

2368 gatagaatgaatgatgaaaaccttgactggtcatttaaaatcgttatctttgacgaaaatgacctagcttatgcgcgtgatatg 

169 DRMNDENLDWSFKIVIFDENDLAYARDM

2452 tttaaaactttcgaaggcaagttacgtccagtgaactacctttcagttgggaatgcaaacgcatacgaagaaggaaaaatcagt 

197 FKTFEGKLRPVNYLSVGNANAYEEGKIS 

2536 gataggcttcttgaaaagttgggatggctttgggataaagtgtatgaagacccagctttcaacaatgttcgacctttaccgcaa 

225 DRLLEKLGWLWDKVYEDPAFNNVRPLPQ 

2620 cttcatacacttgtttatgataataaaagaggagtataa 2658 

253 LHTLVYDNKRGV* 
dplORF021 

2504 atgcaaacgcatacgaagaaggaaaaatcagtgataggcttcttgaaaagttgggatggctttgggataaagtgtatgaagaca

1 MQTHTKKEKSVIGFLKSWDGFGIKCMKT

2588 cagctttcaacaatgttcgacctttaccgcaacttcatacacttgtttatgataataaaagaggagtataaaatgaaaattgag 

29 QLSTMFDLYRNFIHLFMIIKEEYKMKIE

2672 catctagataaaatcggtaacgtattagggagagagaacggatgggcttcccttaagccggatgaaattgtaaccttggacaat 

57 HLDKIGNVLGRENGWASLKPDEIVTLDN

2756 actgaggcagccgttcaaagactttttggtctattaggcgaggacgcagaacgtgacgggttgcaagatactccattccgtttt 

85 TEAAVQRLFGLLGEDAERDGLQDTPFRF
2840 gttaaagcactcgctgaacataccgtagggtatcgagaagaccctaaacttcatctcgaaaaaacattcgacgtcgaccatgaa

113 VKALAEHTVGYREDPKLHLEKTFDVDHE 

2924 gaccttgttcttgtgaaagacattccattcaattctttatgtgagcatcatttagctccgttcgtagggaaggtgcatattgca 

141 DLVLVKDIPFNSLCEHHLAPFVGKVHIA 

3008 tacattcctaaggataagattacaggtctttcaaaattcggtcgagtggttgaaggatacgctaaacgacttcaagtacaagag 

169 YIPKDKITGLSKFGRVVEGYAKRLQVQE

3092 cgcttgactcaacaaatcgctgacgctattcaggaagttctaaatcctcaagcagttgcggtcatcgtagaggctgagcatact 

197 RLTQQIADAIQEVLNPQAVAVIVEAEHT

3176 tgcatgagcggacgcggtattaagaagcacggggcaacgacagtgacttcaactatgcgaggtcttttccaagatgacgcatct 

225 CMSGRGIKKHGATTVTSTMRGLFQDDAS 

3260 gctcgagcagaattgcttcagttgattaaaaagtag 3295 

253 ARAELLQLIKK*
dplORF022 

30896 atgagtaaagacattctttacggaatcaagctcgtgcaaatcgaggagcttgacccattgactcagttgccaaaagtcggcgga 

1 MSKDILYGIKLVQIEELDPLTQLPKVGG

30980 gctaactttgtcgtagatacggcagaaacagcagaactcgaagccgtgacctcggagggaactgaagatgtgaaacgcaatgac 

29 ANFVVDTAETAELEAVTSEGTEDVKRND

31064 acgcgcactcttgctatcgtgcgtactccagaccttttatacggttatgacttaacattcaaggacaacacgtttgaccctgaa 

57 TRILAIVRTPDLLYGYDLTFKDNTFDPE 

31148 atcatggccctaattgaaggtggtacagtacgtcaacaaggcggaactattgctggatacgacaccccaatgcttgcacaaggt 

85 IMALIEGGTVRQQGGTIAGYDTPMLAQG

31232 gcttctaatatgaaaccatttagaacgaacatctatgtgccaaactatgtaggtgactcaattgtcaactacgtgaaaatcact 

113 ASNMKPFRMNIYVPNYVGDSIVNYVKIT

31316 ttgaataactgtaccggtaaagctccagggctttcaatcgggaaagagttctacgctcctgagttcaacatcaaggcacgtgaa 

141 LNNCTGKAPGLSIGKEFYAPEFNIKARE 

31400 gcaaccaaagcaggtttgccagtcaagtcaatggactatgtggcacaacttccagcggttcttcgtcgcgtgacattcgatttg 

169 ATKAGLPVKSMDYVAQLPAVLRRVTFDL 

31484 aacggtggaacaggaaccgccgacgcagttcgagttgaagcaggtaagaagatttctccaaaaccagttgaccctaccttaaca 

197 NGGTGTADAVRVEAGKKISPKPVDPTLT

31568 ggtaaggctttcaaaggctggaaagttgaaggagaatcaactatttgggacttcgacaaccacatgatgcctgaccgagacgtc 

225 GKAFKGWKVEGESTIWDFDNHMMPDRDV 

31652 aaactcgtagcacaatttgcatag 31675 

253 KLVAQFA* 
dplORF023 

6419 atggccaagtccaatttaactagaattgcaaagatggttagagcaggaaacagtgaaggtcctgcttcatcttttgtcaattcg 

1 MAKSNLTRIAKMVRAGNSEGPASSFVNS 

6503 ctgacccgggttattgaacgaactcagcctgaatataatccttcgacatattataagcccagcggggttggtggatgtattcga 

29 LTRVIERTQPEYNPSTYYKPSGVGGCIR

6587 aaaatgtatttcgaaagaatcggtgagtctattatagataacgcagattctaacctaattgcaatgggcgaagctggaacattt 

57 KMYFERIGESIIDNADSNLIAMGEAGTF

6671 aggcacgaagttctccaagagtacatggttaaaatggctgaaatcgatgaggactttgaatggttgaatgtagcagagttcttg 

85 RHEVLQEYMVKMAEIDEDFEWLNVAEFL 

6755 aaagaaaatccagttgaaggaactatcgtcgacgagcgtttcaagaaaaacgattatgaaacgaagtgtaagaacgaacttctt 

113 KENPVEGTIVDERFKKNDYETKCKNELL

6839 caactttcattcttgtgtgacggactagttcgatataaaggcaagctctacattttagagattaagactgaaaccatgttcaag 

141 QLSFLCDGLVRYKGKLYILEIKTETMFK

6923 ttcactaaacatactgagccctacgaagaacacaagatgcaagcaacttgctacggaatgtgtctaggagtcgatgatgtcatt 

169 FTKHTEPYEEHKMQATCYGMCLGVDDVI

7007 ttcctttatgaaaatcgagataacttcgaaaagaaagcctacacgtttcacatcacagacgagatgaaaaatcaagtccttgga 

197 FLYENRDNFEKKAYTFHITDEMKNQVLG 

7091 aaaattatgacctgcgaagagtacgtagagaaaggcgaaagtcctaaaatctattgctcttcagcctattgcccatattgtaga 

225 KIMTCEEYVEKGESPKIYCSSAYCPYCR 

7175 aaggaaggtcgaaatctgtga 7195 

253 KEGRNL*
dplORF024 

25992 atgaacgcagtagatggccaggtagttcatattctacaagtattagcagaagatggaaatgctacggctgaaaagttcgaaaag 

1 MNAVDGQVVHILQVLAEDGNATAEKFEK 

26076 gaagtcagggctgcatctttagtattttcacgaagagcagccgaggcagttgtcaaaggtgaaatctataaggacggcaaaaac 

29 EVRAASLVFSRRAAEAVVKGEIYKDGKN

26160 ctctcgaaacgtgtttggtcttcagccgcacgcgcaggaaatgatgttcaacaaatagtcacacaaggcctagcaagtggaatg 

57 LSKRVWSSAARAGNDVQQIVTQGLASGM 

26244 tctgctacagatatggctaaaatgctcgagaaatatatcgaccctaaggttcgaaaagattgggactttgataagatagctgag 

85 SATDMAKMLEKYIDPKVRKDWDFDKIAE

26328 aagctagggaaacctgctgctcataaatatcaaaatctcgaatacaatgcccttcgacttgctcgaactaccattagccattcc 

113 KLGKPAAHKYQNLEYNALRLARTTISHS 

26412 gccacagctggagtgagacaatggggcaaggttaatccttatgctcgaaaagttcaatggcattccgttcacgctccaggtcga 

141 ATAGVRQWGKVNPYARKVQWHSVHAPGR

26496 acgtgtcaagcgtgtatcgatttagatggtgaagtatttcctatcgaagaatgtcctttcgaccatcctaatggaatgtgctac 

169 TCQACIDLDGEVFPIEECPFDHPNGMCY

26580 caaactgtatggtacgaaaactcactcgaagaaatcgctgatgagctgagaggctgggcagacggagaacctaatgatgtatta

197 QTVWYENSLEEIADELRGWVDGEPNDVL

26664 gacgaatggtacgacgatttaagttcaggaaaagttgagaaatacagcgacctcgactttgttaaaagttattag 26738 

225 DEWYDDLSSGKVEKYSDLDFVKSY* 
dplORF025 

18778 atggcaaagaacaaaaagcgaaaaaaagtaaatgtcaaaaggaaaatgcttatccctacaaatctctcgaaaaaagtaaatgta 

1 MAKNKKRKKVNVKRKMLIPTNLSKKVNV




18694 aaagcaatcgcttatagaaaagtcactgttaagtggctgcctaatacagatgaaattcaagtatatttcgacctttatataaat

29 KAIAYRKVTVKWLPNTDEIQVYFDLYIN

18610 aaaaacaggctgacaatgttaggcactattgacccggacaagagctattttgaaggaattaggattgtttgtaagaaacctcag 

57 KNRLTMLGTIDPDKSYFEGIRIVCKKPQ 

18526 ccttggatgactgttaaggagctccaggttgcgcgtgcagacgccccaggtttttctgcagttcctaaagcctattgtcacacg 

85 PWMTVKELQVARADAPGFFAVLKAYCHT 

18442 gttggcgatgtactagatagcggagcagagcctactgaaattgttcaaggtattatgtataaagacggtgaactatttaaggac 

113 VGDVLDSGAEPTEIVQGIMYKDGELFKD 

18358 agtgaaattgtcagccttttcaaatacgatgtcaaagagccttatgagtttccaaaggaccttcctataaccttggacaacttt 

141 SEIVSLFKYDVKEPYEFPKDLPITLDNF 

18274 ttagagttcattatgtctagccagcatactagagcacttgttttgcgttgtgctaatataggtgagttttccaagaattggcgg 

169 LEFIMSSQHTRALVLRCANIGEFSKNWR 

18190 aaatggcaaaaagctatccagctcctgctcgactatgccaaggcggatgactttaaagtagacgaaactgtttgggacttttca 

197 KWQKAIQLLLDYAKADDFKVDETVWDPS 

18106 cccggctctaaagctggaaaggtagcacgtcgtaaaggctatgaggcaattcaacaagcccttgagcagataaataaataa 18026

225 PGSKAGKVARRKGYEAIQQALEQINK* 
dplORF026 

21512 atggcgaaagctactggaccaaaagttcgaagaggaaaaactcctccacggccaaaagacaaaaaaggaatcaaagcaaatgcg 

1 MAKATGPKVRRGKTPPRPKDKKGIKANA 

21596 cgtgtcaataaagaccagttcgtagagtatgactataaaggcatcaagatgacaattaaggaacgtgatgctagaatgaaattg 

29 RVNKDQFVEYDYKGIKMTIKERDARMKL 

21680 gaatttattagaggcatgactattcaggaaattgcagcccgctatggattaaatgaaaagcgtgttggcgaaatacgggctcgc 

57 EFIRGMTIQEIAARYGLNEKRVGEIRAR 

21764 gataaatgggtgaaggctaagaaagagttcgagaatgaaaaggctcttgttactaatgatacattgactcaaatgtatgcaggg 

85 DKWVKAKKEFENEKALVTNDTLTQMYAG 

21848 tttaaagtctcagtcaatattaaatatcacgccgcctgggagaaactaatgaacatcgtcgaaatgtgtttagataatcctgac 

113 FKVSVNIKYHAAWEKLMNIVEMCLDNPD 

21932 agatatttatttactaaagaaggaaatattagatggggcgcattagatgtcctttcgaaccttatagatagagctcaaaaagga 

141 RYLFTKEGNIRWGALDVLSNLIDRAQKG 

22016 caagaaagagcgaatggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggcc 

169 QERANGMLPEEVRYRLQIEREKITLLRA

22100 aaaatgggcgaccaggaaattgaaggcgaggttaaagataacttcgtagaagcactagataaagcagctcaagccgtttggcaa 

197 KMGDQEIEGEVKDNFVEALDKAAQAVWQ 

22184 gaatttagtgacgcaacaggttcctacattaaaggagtgactgataatgacaataagcctgagaaataa 22252 

225 EFSDATGSYIKGVTDNDNKPEK* 
dplORF027

52762 atgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtgac 

1 MGKVSIQKSGTFSSGSNNEFFTLADHGD 

52846 agcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtagtccacgaagcagacgttgacggt

29 SAIVTLLYDDPEGEDMDYFVVHEADVDG 

52930 cgtcgacgctatatcaattgcaatgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacgga 

57 RRRYINCNAIGEDGETVHPDNCPLCQNG 

53014 ttccctcgtattgaaaaactatttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttat 

85 FPRIEKLFLQLYNHDTGKVETWDRGRSY 

53098 gttcaaaagattgttacatttatcaataaatatggaagccttgtgactcagccttttgaaattattcgttcaggagctaaaggt 

113 VQKIVTFINKYGSLVTQPFEIIRSGAKG

53182 gaccaacgaactacttatgaattccttccagagcgtccggaagacagtgctactcttgaagattttccagaaaagagcgaactt 

141 DQRTTYEFLPERPEDSATLEDFPEKSEL 

53266 cttggaactctaattttagacctcgacgaagaccaaatgtttgacgtggttgacggcaagttcactcttcaagaagagcgttct

169 LGTLILDLDEDQMFDVVDGKFTLQEERS 

53350 tcaagtcgttcaaattcacgtagaggagcatctcctgcgcctagacgaggttccggtcgagaatcttcacaaggtcgaacagct 

197 SSRSNSRRGASPAPRRGSGRESSQGRTA 

53434 gaaagaactccttcagttagtcgaagaactcctccaacacgaggtcgaggattctaa 53490

225 ERTPSVSRRTPPTRGRGF*
dplORF028

44595 atgtcaaaaattaaattcgaaaaccttaaaaaaggcgatgttgtgctacgagctaaatctcaaacgaagtttaaaatcgtttca

1 MSKIKFENLKKGDVVLRAKSQTKFKIVS 

44679 attttagcagacgaaaagaaagcagaccttgaatcattagaagacggaggtgaacttcacctttcagcttcaactctcgaacgt 

29 ILADEKKADLESLEDGGELHLSASTLER 

44763 tggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgctaaacccgctaaaaaggctgctcctgcagttgctcga 

57 WYTMEDETEPKKEEAAKPAKKAAPAVAR 

44847 cctgctcgaaaaggcagagtcgttcccaaacctaaaaaagaagtccttgaggaagaaattcctgaagttaaggaacagccggaa

85 PARKGRVVPKPKKEVLEEEIPEVKEQPE

44931 gaagttggttcagttagtgagaaatctactgttcgaaaacctgctcctaaaaaagaaagcgtgatggcgattactaaggctctt 

113 EVGSVSEKSTVRKPAPKKESVMAITKAL 

45015 gaaagtcgaattgttgaagcctttcctgcgtctactcgaatcgtcactcagtctcacatcgcctatcgctctaagaagaacttc 

141 ESRIVEAFPASTRIVTQSYIAYRSKKNF

45099 gttactatcgaagaaactcgaaaaggCgtttctactggagttcgcgcaaaagggtcgacagaagaccaaaagaaacMictt^ca 

169 VTIEETRKGVSIGVRAKGLTEDQKKLLA

45183 tctattgctcctgcatcttacgaatgggcgattgacggaatttttaaactcgtcaaggaagaagatattgacaccgcaatggaa 

197 SIAPASYEWAIDGIFKLVKEEDIDTAME

45267 ttgattgaagcttcccacctttcttcgctatga 45299 

225 LIEASHLSSL* 
dplORF029
662 atgaaatcagtagttttattatccggcggagtcgacccagccacccgtttagcaattgaagttgacaagtggggttctaaaaat

1 MKSVVLLSGGVDSATCLAIEVDKWGSKN

746 gttcacgctatagcattcaattacggacaaaagcatgaagcagaacttgaaaatgctgctaatgttgcaatgttctacggagtc

29 VHAIAFNYGQKHEAELENAANVAMFYGV

830 aaattcaccattcttgaaattgactcgaaaatctacccaagctctagctcttccttattacaaggaaaaggcgaaatttcacat 

57 KFTILEIDSKIYSSSSSSLLQGKGEISH

914 ggaaaatcttacgctgaaatcctagcagagaaggaagtagttgacacctatgttccatttagaaatggactaatgctttcacag

85 GKSYAEILAEKEVVDTYVPPRNGLMLSQ

998 gctgcggcttatgcttattcggttggagctccttacgtcgcatatggtgctcacgcagacgatgcggctggaggtgcttaccct 

113 AAAYAYSVGASYVVYGAHADDAAGGAYP

1082 gattgcactcctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaacccttgtcgctcctcta

141 DCTPEFYNSMSNAMEYGTGGKVTLVAPL

1166 cttactctaaccaaggcgcaagtcgttaaatggggaattgatttagatgttccttatttcttgactcgttcatgttatgaaagt 

169 LTLTKAQVVKWGIDLDVPYFLTRSCYES

1250 gacgctgaaagttgtggaacttgcgcaacttgtatcgaccgcaaaaaggcattcgaagaaaatggaatgactgaccctattcat

197 DAESCGTCATCIDRKKAFEENGKTDPIH

1334 tataaggagaattga 1348 

225 YKSN*

dplORF030

20088 atgaataacgaaaaaattattgaaaaaattaaaaatcttattcaattagcaaatgacaacccgagtgacgaagaggggcaaact

1 MNNEKIIEKIKNLIQLANDNPSDEEGQT

20004 gcccttcttatggctcaaaagttgatgctaaagaataatatcgcacttgctcaagttgaacaatttgatgaacctaaacagttc 

29 ALLMAQKLMLKNNIALAQVEQFDEPKQF

19920 gagacctctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcatattctcgcgactaatttt 

57 ETSQAVGKEAGRIFWWERELGHILATNF 

19836 aggtgcttttgtattaatcagcgtgatatgcgcttgaataaaagtcgaataattttcttcggcgaaaaacaagacgctgaatta 

85 RCFCINQRDMRLNKSRIIFFGEKQDAEL

19752 gtgcctaaaatatatgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaat 

113 VSKIYEAALLYLRYRIDRLPTREPSYKN

19668 tcatacctcaaaggctttttgtcagccttagccattcgatttaaaaagcaggtggaagaatattcacttatggtcctacctagc 

141 SYLKGFLSALAIRFKKQVEEYSLMVLPS

19584 gagcaaacaaaaaatgcgcttcaggacacatttcgaaatttaaagaaggaaggaattgacagacctcaacatgacttcaatctt

169 EQTKNALQDTFRNLKKEGIDRPQHDFNL

19500 gaagcgtatattgaagggcggtttcatggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaa 19423 

197 EAYIEGRFHGENAKIMPDEILEGGN*



dplORF031 

26943 atggcttatcaattagaagacttgttaaaaggtctagacgaaccaactatcaaacaggtgaaggaaattatttcgaaaacttcg 

1 MAYQLEDLLKGLDEPTIKQVKEIISKTS

27027 aaagaactcgatgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaacag 

29 KELDAKIFIDGDGQHFVPHARFDEVVQQ

27111 cgcgatgcagctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcg

57 RDAANGSINSYKEQVATLSKQVKDNGDA

27195 cagaccactatccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtgattacttcagctcttcat 

85 QTTIQNLQEQLDKQSQLAKGAVITSALH

27279 ccgttgattagtgactccattgctccagcagcagacattctcggatttatgaaccttgacaacattacggtcgaaagtgacggt 

113 PLISDSIAPAADILGFMNLDNITVESDG

27363 aaagttaaaggtcttgatgaagagctgaaagctgttcgtgagtctcgtaaatacttattcaaagaagccgaagttcccgcagaa 

141 KVKGLDEELKAVRESRKYLFKEVEVPAE

27447 caagaggctcaagctaagtcgccagccgggactggaaatttaggaaatccaggtcgtgtcggtggtggtgttcccgaacctcgt 

169 QEAQAKSPAGTGNLGNPGRVGGGVPEPR

27531 gaaatcggctcttttggtaagcaacttgctgctgctcaacaaacggcaggagcacaagaacaatcatcattctttaaataa 

197 EIGSFGKQLAAAQQTAGAQEQSSFFK*
dplORF032 

52033 atgaaagaagcgaatagactagtttctagctatgtaggattcgaatgctggactgacgaagaatgtatcaggaactttgaacta 

1 MKEANRLVSSYVGFECWTDEECIRNFEL

52117 gaccctgatatgtcaattgcgtctgcttatcatcgttattttgggatgctttattcctatgcaaaaaggtttaaatgcttatct 

29 DPDMSIASAYHRYFGMLYSYAKRFKCLS 

52201 cgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttcaaatcaaaccaaggggccaagttttca 

57 RHDIESIAFETISKCLATFKSNQGAKFS 

52285 acttaccttacaagactcttcaagaatagaatagccttagaatataggtacctaaatgcaccttccatgaatcgaaattggtat 

85 TYLTRLFKNRIVLEYRYLNAPSMNRNWY 

52369 gcagaagtgacgttcgacagcgtttcgacaaatgaagaaggcgacgattttagtatcctatcgacagttggctattgtgaagac

113 VEVTFDSVSTNEEGDDFSILSTVGYCED

52453 tacggaaaaattgaaattgaagcaagtcctgacttcatgacgctttctaatacagagtatgcttatatctcgtctgtcatccaa 

141 YGKIEIEASLDFMTLSNTEYAYISSVIQ

52537 aacggtccttcagtaagcgacgcagaaattgcgcgtgaaatcggagtaagcaggtctgctattagtcagtctaagaagtcacta 

169 NGPSVSDAEIAREIGVSRSAISQSKRSL

52621 aaaaataaattaaaagattttatataa 52647 

197 KNKLKDFI*
dplORF033 

7670 atggcaagacctaagttacctcaaattgatattcgagaagaagaaatacgagatgctcaagacgtagcagactcgtatggtgcg

1 MARPKLPQIDIREEEIRDAQDVADSYGA

7754 attatcaataaagtagtcgacgaaattgttgaagcagcctgcggttcacttgaccaggcaatggaagaaattcaaatagttgta 
29 IINKVVDEIVEAACGSLDQAMEEIQIVV

7838 agccaaaatcctgtcattatggaagaccttaaccactacattggctatcttcccactcttctttatttcgccgcagatagggcg 

57 SQNPVIMEDLNYYIGYLPTLLYFAADRA 

7922 gaaatggtgggaatacaaatggattcaagttctgctatcaggaaagaaaaatacgataatctatacattttagccgccgggaaa 

85 EMVGIQMDSSSAIRKEKYDNLYILAAGK 

8006 actattcctgacaagcaagcagaaactcgaaaacttgtcatgaatgaagaagtcatcgaaaatgcttacaagcgagcctacaag 

113 TIPDKQAETRKLVMNEEVIENAYKRAYK

8090 aaagttcaattaaagctagaacaggccgataaggtattagcatctttaaaacgaattcaaacctggcaactagcagagttagaa 

141 KVQLKLEQADKVLASLKRIQTWQLAELE 

8174 actcagtcaaataattcaaaaggagtattattaaatgcaaaaagacgtagacgtgaaaatgattga 8239 

169 TQSNNSKGVLLNAKRRRREND*
dplORF034

131 atgagtcaaaacactacacgcactgacgctgaattgacaggcgttactcttttaggaaaccaagacaccaaatacgattatgac 

1 MSQNTTRTDAELTGVTLLGNQDTKYDYD 

215 tataatccagacgtccttgaaactttccctaacaaacatcctgaaaataattacctagtaacatttgacggatatgaattcact 

29 YNPDVLETFPNKHPENNYLVTFDGYEFT

299 tccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatggttgaatctaaa 

57 SLCPKTGQPDFANVFISYIPNEKKVESK

383 tcattgaaattgtacttattcagtttccgtaaccacggtgacttccacgaagattgcatgaacattattttgaatgacttgtat 

85 SLKLYLFSFRNHGDFHEDCMNIILNDLY

467 gaattgatggaacctaagtacattgaagtcatgggcctattcactccccgtggtggaatttcaacttacccattcgtcaacaaa 

113 ELMEPKYIEVMGLFTPRGGISIYPFVNK

551 gtgaatcctcaatttgcaactcctgaacttgaacagcttcaacttcaacgcaaattgaacttccttggaaatgttcaaggtctt

141 VNPQFATPELEQLQLQRKLNFLGNVQGL 

635 ggacgagctattcgatag 652 

169 GRAIR*
dplORF035 

17425 atgcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagt

1 MHLMKDSKMLRTWKSLAFEFETKVRTTS 

17341 gggttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataaaatgaaggtatttatcaacaat 

29 GLKLSPAMKTMTRTKIWKGYKMKVFINN 

17257 catactgaagctgatattgactacaaagatattctaaattttgtagcttatcgaaactctcctaaccctcaaattcaaatcact 

57 HTEADIDYKDILNFVAYRNSPNPQIQIT

17173 agctggaacgctttgctttcctgctatacacggaatgagctttcttataaaggagtttcaataacggacttttttgaagccatt 

85 SWNALLSCYTRNELSYKGVSITDFFEAI

17089 caaactattgcaagttccttcactcacctagactcgaaaacaattgatacacaaaatgaaaagcgactcgaaaggattgaggaa 

113 QTIASSFTHLDSKTIDTQNEKRLERIEE 

17005 cttcagtcaagaataggtcattgtaactgtactatcgacgaacttaaaaaaggagtccacgaaatgccggatattgaatcagct 

141 LQSRIGHCNCTIDELKKGVHEMPDIESA 

16921 atctcttaccagtacggacagattcttgcttatgaagatgaacttaattttctgctaaactaa 16859 

169 ISYQYGQILAYEDELNFLLN*
dplORF036 

48808 gtgttagtcgaacgaaaagccgacaaggaatgttgggaatggctagaagctgttcgagcaaatatagtcgaagaagttcgaaac 

1 VLVERKADKECWEWLEAVRANIVEEVRN

48892 ggtcttagcattgttattgctccgaatactgtcgggaatgggaaaactagctgggcggttcgacttttgcaacgctatttagca 

29 GLSIVIASNTVGNGKTSWAVRLLQRYLA 

48976 gaaactgcacttgacggaagaattgttgagaaaggaatgtttgtagtgtcagctcaactattgactgagtccggcgactataat 

57 ETALDGRIVEKGMFVVSAQLLTEFGDYN 

49060 tattttcaaaccatgcaagaatttctcgaacgtttcgagcgccttaagacttgtgagctattagtcatagacgaaataggtgga 

85 YFQTMQEFLERFERLKTCELLVIDEIGG 

49144 ggttccttaaccaaggcctcttatccttatctgtatgacttggttaattatagggttgacaataacttgtcgactatttatacg 

113 GSLTKASYPYLYDLVNYRVDNNLSTIYT 

49228 actaattatactgacgatgaaactattgaccttttaggccaaaggctttatagtcgtatatatgatacttcagtggttctagat 

141 TNYTDDEIIDLLGQRLYSRIYDTSVVLD 

49312 tttcaggcaagcaatgtaagaggattggaggtaagcgaaattgaatcatag 49362

169 FQASNVRGLEVSEIES* 
dplORF037 

55855 atggtgaagaaattgaaatctaaaatctattcagttgcatacataattctagtagttattgcgaaccttgtgacaatttatttc 

1 MVKKLKSKIYSVAYIILVVIANLVTIYF 

55939 gaacctttaaatgtgaaaggaactttaattcctccaagcagttggtttatgggattcactttcctgcttataaatctaataagc

29 EPLNVKGILIPPSSWFMGFTFLLINLIS

56023 aagtacgagaagccaaaattcgcaggttctttgatatgggtagggttattccttacctcgttgatttgctttatgcaaaaccta

57 KYEKPKFAGSLIWVGLFLTSLICFMQNL

56107 ccacaatcgcttgtcgtggcttcaggagttgcattttggataagtcaaaaagcaagtgtctttacattcgacaagctctcgaat

85 PQSLVVASGVAFWISQKASVFIFDKLSN

56191 aaattagactcgaagattgcaaatgctttgtctagcaacatcggttctattatagacgcaaccatatggatttcattaggactg 

113 KLDSKIANALSSNIGSIIDATIWISLGL 

56275 agtcctcttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttctagttcagtttatcttgcag 

141 SPLGIGTVAYIDIPSAVLGQVLVQFILQ

56359 ccaattgcttcgagatatttgaaaaagtag 56388

169 SIASRYLKK*
dplORF038 

1350 atgagagtttctaaaaccttaacattcgacgcagctcatcaactagttggacattttggaaaatgcgcaaatttgcacgggcat 

1 MRVSKTLTFDAAHQLVGHFGKCANLHGH

1434 acttacaaagtcgaaatttcatcagcaggcggaacttatgaccacggttcgagtcaagggatggttgttgacttttatcacgtc 

29 TYKVEISLAGGTYDHGSSQGMVVDFYHV
1518 aagaaaatcgcaggtacattcattgacagacttgaccacgctgttcttcttcaagggaatgaaccaatcgctttagcaaatgca

57 KKIAGTFIDRLDHAVLLQGNEPIALANA

1602 gttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacggagctt 

85 VDTKRVLFGFRTTAENMSRFLTWTLTEL

1686 atgtggaagcacgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgagattttc

113 MWKHARIDSIKLWETPTGCAECTYYEIF

1770 acagaagacgagattgaaatgttcaagaacgtaacctttatcgacaaagacgaaaagattactgtccgcgaaattttagagcag 

141 TEDEIEMFKNVTFIDKDEKITVREILEQ

1854 gagcaggataatggttaa 1871 

169 EQDNG*
dplORF039 

3306 atgaataaaagtgcaacctttcggcttgttcgaacagctcttattgcggctctatatgtgacattgaccgttgcattttctgct 

1 MNKSATFWLVRTALIAALYVTLTVAFSA

3390 atcagttatggacctattcaatttagagtcagtgaagccttgattcttctacctttatggaaccacagatggactccggggatt 

29 ISYGPIQFRVSEALILLPLWNHRWTPGI 

3474 gtattaggaacaattattgcaaacttcttttcacctcttggactgattgacgttttattcggtccacttgctaccttccttgga 

57 VLGTIIANFFSPLGLIDVLFGSLATFLG

3558 gtagtggcaatggtgaaagttgctaagatggcaagtcctctatattcacttatctgtccagctcttgctaatgcttaccttatt 

85 VVAMVKVAKMASPLYSLICPVLANAYLI

3642 gcgctggaacttcgaatagtttactctttacctttttgggaatctgtcatctatgtaggaattagtgaagcgattatcgtttta 

113 ALELRIVYSLPFWESVIYVGISEAIIVL 

3726 atttcatacttccttatttccacgctggcgaagaacaatcattttagaacactgataggagcgaaaaatgggatttaa 3803 

141 ISYFLISTLAKNNHFRTLIGAKNGI* 
dplORF040 

7192 gtgagctatactggaaaaatgttcgaggaagactttttcgaaggtgcaaaagactttgagaaagatgctttcacggtccgtcta 

1 VSYTGKMFEEDFFEGAKDFEKDAFTVRL

7276 tatgataccactaatggatttcgaggagttgcaaatccctgcgattatatagccgcaactaactttgggaccttgtttattgaa 

29 YDTTNGFRGVANPCDYIAATNFGTLFIE

7360 ctgaaaactactaaagaagcttctttgagctttaataacatcactgataatcaatggttccagctatcacgcgcagatggatgc 

57 LKTTKEASLSFNNITDNQWFQLSRADGC

7444 aaatttattctcgccggaattttagcgtatttccaaaagcatgaaaagattatatggtatccaatttcaagccttgaaaaaatt 

85 KFILAGILVYFQKHEKIIWYPISSLEKX

7528 aaacggtctggagttaaaagcgccaacccaaacttcatcgatgcagggtatgaagtttcttacaagaagcgtcgaactagattg 

113 KRSGVKSVNPNFIDAGYEVSYKKRRTRL 

7612 accattcctttccaaaatgttctagatgcagttgagctccactacaaggagaaaagcaatggcaagacctaa 7683

141 TIPFQNVLDAVELHYKEKSNGKT* 
dplORF041 

8208 atgcaaaaagacgtagacgtgaaaatgattgaccctaaacttgaccgattaaaatacacaggtgattgggttgatgtacgaatt 

1 MQKDVDVKMIDPKLDRLKYTGDWVDVRI 

8292 agttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaagtatattcagtg 

29 SSITKIDADSADVSRCRKVLQKAQVYSV 

8376 gcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgcatcctcgttcc 

57 AAGECIKIAHGFALELPKGYEAILHPRS 

8460 agtctttttaagaaaactggtctaatcttcgtttctagcggagtgattgacgaaggttacaaaggtgacactgatgaatggttc 

85 SLFKKTGLIFVSSGVIDEGYKGDTDEWF

8544 tcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaatttagaattcaggaaaagcaacctgct 

113 SVWYATRDADIFYDQRIAQFRIQEKQPA 

8628 atcaagttcaatttcgtagaatctttaggaaatgcggctcgtggaggccatggaagtacaggtgatttctaa 8699 

141 IKFNFVESLGNAARGGHGSTGDF* 
dplORF042 

48082 gtggcaaggcaaagaataggcaattcaggaaagcctaaaaatgaaattgaactaacattcaaagacaagcctaaaactcgttct 

1 VARQRIGNSGKPKNEIELTFKDKPKTRS 

48166 accttattcaagaaggacgtggcaacaggtctttcaaaagtcgagcatgattattttcaaatagttgaagcacttaacggaaaa 

29 TLFKKDVATGLSKVEHDYFQIVEALNGK 

48250 caattcgaacctaatatgaagcaggtgtcatcttcctttatagttcagtatgaatttaccttcaatattaagtgcatcgattat 

57 QFEPNMKQVSSFFIVQYEFIFNIKCIDY

48334 aactggttcaacttttcgagcactatgaaaaatgttcgaacttatttaaacattgagtcgaacattgaactttgtcgattttta 

85 NWFNFSSTMKNVRTYLNIESKIELCRFL

48418 gctgaaagttttgttaaatatgaaaatgttcgaaaaagattgaacctaagcgaaaggttcataacggtctcgactttcaaaaga 

113 AESFVKYENVRKRLNLSERFITVSTFKR

48502 gcctggattttggacgaactcgaaggaaaaacgggttcaaaattcgaaggattttattag 48561

141 AWILDELEGKTGSKFEGFY* 
dplORF043 

31699 atgactaatattatcacagctgagcagtttaagcaacttgcatttcaaatcatcgcacttccaggattttcaaaaggtagtgaa

1 MTNIITAEQFKQLAFQIIALPGFSKGSE

31783 cctatccatgttaaaattcgagcagcaggtgtcatgaacctaatcgctaacgggaaaatccctaatacgcttttaggtaaagtg 

29 PIHVKIRAAGVMNLIANGKIPNTLLGKV

31867 acagaactgtttggagaaacttcgacagtcactaaagacaatgctagtctagcatcaattactgaccaacagaagaaagaagcg 

57 TELFGETSTVTKDNASLASITDQQKKEA

31951 ctcgaccgattgaacaaaaccgataccggtattcaagacatggctgaacttcttcgagtattcgcagaagcttcaatggtagag

85 LDRLNKTDTGIQDMAELLRVFAEASMVE 

32035 cctacttacgctgaagtcggcgagcatatgacagatgagcaacttatgacaatcttcagtgcaatgtacggtgaagtgactcaa 

113 PTYAEVGEYMTDEQLMTIFSAMYGEVTQ

32119 gctgaaacctttcgtacagacgaaggaaatgtctaa 32154 

141 AETFRTDEGNV* 
dplORF044

25666 atggtaagtgttttgattagcagcagctcctttttgaagttcctgcttcattttagctcgacaagtatttctaaatcgaataag 

1 MVSVLISSSSFLKPLLHFSSTSISKSNK 

25582 gttttcaatttccttgtttcctacataagtggtgaaccgataatggcacttaggacattcgaagaatctccactctacgccctt 

29 VFNFLVSYISGEPIMALRTFEESPLYAL 

25498 ttcgatatgtttcgaaataatctgtttagatgtaaggtcgaacttatgctcacaatggtcacaattaaccttgaacgtctgggt 

57 FDMFRNNLFRCKVELMLTMVTINLERLG 

25414 cgactccttcttcggttggttgttcagtttgttctttttctttgtcatcaacttcgtcttcttcactcgttccatcttgaggct 

85 RLLLRLVVQFVLFLCHQLRLLHSFHLEA 

25330 cctcttgttcgtttaactcgtttgctaatacaggcaatgctccagctgagatttcgtcaagctgagcaagttcttccaaaatgc 

113 PLVRLIRLLIQAMLQLRFRQAEQVLPKC

25246 gttcccattccttgtccgccttttccttcttactga 25211 

141 VPIPCPPFPSY* 
dplORF045 

25340 atgaaacgagtgaagaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacg 

1 MKRVKKTKLMTKKKNKLNNQPKKESTQT 

25424 ttcaaggttaattgtgaccattgtgagcataagttcgaccttacatctaaacagattatttcgaaacatatcgaaaagggcgta 

29 FKVNCDHCEHKFDLTSKQIISKHIEKGV

25508 gagtggagattcttcgaatgtcctaagtgccattatcggttcaccacttatgtaggaaacaaggaaattgaaaaccttattcga 

57 EWRFFECPKCHYRFTTYVGNKEIENLIR 

25592 tttagaaatacttgtcgagctaaaatgaagcaggaacttcaaaaaggagctgctgctaatcaaaacacttaccattcatatcga

85 FRNTCRAKMKQELQKGAAANQNTYHSYR

25676 attcaggatgagcaagctgggcataaaatctcagggcttatggcgaagctaaagaaggagataaacattgaaaaacgagaaaaa 

113 IQDEQAGHKISGLMAKLKKEINIEKREK 

25760 gaatgggtatctatatag 25777 

141 EWVSI*
dplORF046 

42774 atgccaatgtggctaaacgacacagcagtcttgacgacgattattacagcgtgcagcggagtgcttactgtcctactaaataag 

1 MPMWLNDTAVLTTIITACSGVLTVLLNK

42858 ttattcgaatggaaatcgaataaagccaagagcgttttagaggatatctctacaactcttagcactcttaaacagcaggtcgac 

29 LFEWKSNKAKSVLEDISTTLSTLKQQVD 

42942 gggattgaccaaacgacagtagcaatcaatcaccaaaatgacgtcattcaagacggaactagaaaaattcaacgttaccgtctt 

57 GIDQTTVAINHQNDVIQDGTRKIQRYRL 

43026 tatcacgacttaaaaagggaagtgataacaggctatacaactctcgaccattttagagagctctctattttattcgaaagttat 

85 YHDLKREVITGYTTLDHFRELSILFESY

43110 aagaaccttggcggaaatggtgaagttgaagccttgtatgaaaaatacaagaaattaccaattagggaggaagatttagatgaa 

113 KNLGGNGEVEALYEKYKKLPIREEDLDE 

43194 actatctaa 43202 

141 TI*
dplORF047 

47542 atgaaatttgaagatgaaaaacagttcatcgctgcaattgaagaagccggtgaattaaatgctaccaaaggcgacatggagaaa 

1 MKFEDEKQFIAAIEEAGELNATKGDMEK 

47626 caagtcaaaagtcttcgtgatgctctaaaagagtacatgaaagaaaatgacattgaatctgctcaaggtaagcacttttctgct 

29 QVKSLRDALKEYMKENDIESAQGKHFSA 

47710 accttctacacgacagagcgctcaactatggacgaagaacgcttgaaagaaattatcgaaaaattagttgacgaagccgagacg 

57 TFYTTERSTMDEERLKEIIEKLVDEAET

47794 gaagaaatgtgtgaaaaactttcagggcttatcgaatacaagcctgtcatcaatacgaaacttctcgaggatatgatttatcac 

85 EEMCEKLSGLIEYKPVINTKLLEDMIYH

47878 ggcgagattgaccaagaagcaattcttccagcagttgtcatttctgttacagaaggcattcgttttggaaaggctaaaatttag 47961

113 GEIDQEAILPAVVISVTEGIRFGKAKI* 
dplORF048 

16709 atggaaacaacactttatttcggttaccttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcatt 

1 METTLYFGYLTADWKDGHKNYTFHYESI 

16625 cctgtaaaagaaactgagaaacaatataaggtcactggaatcaatcctaacttgtacttagacctaggctcagttattagaaag 

29 PVKETEKQYKVTGINPNLYLDLGSVIRK 

16541 agcgaacttgacattgcagtattcaaagcatgtcctgccgctgaaactggagtcacacttactcgcgacatggaagttgatgct 

57 SELDIAVFKACPVAETGVTLTRDMEVDA 

16457 agaattgaaatcatcaagaaattaactacaagaatcgaacgccttaacgaaagaattaaagcaagaaatgaacaaggtaaacaa 

85 RXEIIKKLTTRIERLNERIKARNEQGKQ

16373 gaaagccgccacctagtatctgcgctagaagattgcgctcgtcaaattgctggaatttatcaataa 16308 

113 ESRHLVSALEDCARQIAGIYQ* 
dplORF049 

44018 atgtttcaaccatttctcagcgagcatgtagccttggtcgtcaaagtagaaccaagacttgttttcttcgatatactcgaactc 

1 MFQPFLSEHVALVVKVEPRLVFFDILEL 

43934 atcttttggataagttccgtttgctcgagcgtaccagaaaccagtagcatctttctgccagccaagtttcttctcagccggttg 

29 IFWISSVCSSVPETSSIFLPAKFLLSRL 

43850 agcatttgcgttagtcaagcgatagacgtagtagtaaggttgacctgcatagtaccaacgctcatcgtggtcgttgacggaaat 

57 SICVSQAIDVVVRLTCIVPTLIVVVDGN

43766 tccgtcgtaggcgtagttgcagtgaatgatgttatcactgtcaatgaacatccctgtatgacctccagcgcctacgctagcacc 

85 SVVGVVAVNDVITVNEHPCMTSSACAST

43682 tttgcgtccccagatgaagatgtcgcctcgtttagcatcccacggagcattttcactaattag 43620 

113 FASPDEDVASFSIPRSIFTN*
dplORF050

15081 atgaacaatcagcgaaagcaaatgaacaaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttc 

1 MNNQRKQMNKRIVELREDYQRARGRINF 
15165 cttcttgctgtaaaggaccacggcgaagaactcgaaaaccttgaagcctttgtgggatacattgacaatctagtcgaatgtttt

29 LLAVKDHGEELENLEAFVGYIDNLVECF 

15249 cctgaaagccaacgaaatgtcttgaggctatgtgtattagatgaccttccagtcactaatgcggccgctgaaattggataccac 

57 PESQRNVLRLCVLDDLPVTNAAAEIGYH 

15333 tatacatgggttcaccaacttcgagacaaagcagttgaaacacttgaagaaattttagatggggataacattattcgctctaaa 

85 YTWVHQLRDKAVETLEEILDGDNIIRSK

15417 cacggaatcgaaattaaggagaaacttgatgaattatatggtaaaagtcattctagttag 15476 

113 HGIEIKEKLDELYGKSHSS* 
dplORF051

29765 atgagttatgacgtgaactatgtcaagaatcaagttcgtagagccattgaaaccgctcctactaaaatcaaggtacttcgaaac 

1 MSYDVNYVKNQVRRAIETAPTKIKVLRM 

29849 tcttgggtcagtgatggatatggaggaaagaaaaaggataaagcgaatgaagtcgtagcagacgaccttgtttgtttagttgat 

29 SWVSDGYGGKKKDKANEVVADDLVCLVD 

29933 aattcaactgttcctgaccttttagccaattctactgacgcgggaaaaatttttgcccaaaatggagtgaaaattttcattcta 

57 NSTVPDLLANSTDAGKIFAQNGVKIFIL 

30017 tatgatgaaggcaaaatcattcaacgagccgatactatcgaaattaaaaactcaggaagacggtacagggtagtagaaacccac 

85 YDEGKIIQRADTIEIKNSGRRYRVVETK

30101 aatcttctcgagcaagacattttgatagaacttaaattggaggtgaacgactaa 30154 

113 NLLEQDILIELKLEVND* 
dplORF052 

30516 atgactaaacgaacgacaatgatggacagattgaaggaaattcttcctacatttcagctctcgcctgctcctatgcttccagga 

1 MTKRTTMMDRLKEILPTFQLSPAPMLPG 

30600 gttgaatttgacgagcaagatacagataggccggatgactacattgttcttcgatatagtcatagaacgcccagcgcaacaaat 

29 VEFDEQDTDRPDDYIVLRYSHRMPSATM

30684 agcctaggaagttttgcttattggaaagttcaaatctacgtccattcaaactcaattattggtatcgacgaatatagcagaaag 

57 SLGSFAYWKVQIYVHSNSIIGIDEYSRK 

30768 gttcgaaacattatcaaggacatgggctacgaagtaacctatgcagaaactggtgactacttcgacacaatgctttctagatac 

85 VRNIIKDMGYEVTYAETGDYFDTMLSRY

30852 cgactagaaatcgaatatagaattccacaaggaggaaactaa 30893 

113 RLEIEYRIPQGGN*
dplORF053 

50300 atgctaacattcgaaagaatagtatctatacgagcaccaacttgcacttcactcatttccccgctatatagaaggacatcatgc 

1 MLTFERIVSIRAPTCISLISPLYRRTSC

50216 ccgttcttccaagcagttgcaagcattttatcaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaaatcc 

29 PFFQAVASILSIVHDLPCPGRAIMTIKS

50132 tcaccaggaagtaagcctccaagcacgtcgtccaatagttcaaaccctgtcgatattccaagtctttcaccgtcatggtttcta 

57 SPGSKPPSTSSNSSNPVDIPSLSPSWFL 

50048 atagtattcgcccagtctagtcgaagtttagcatttcgagcaatgtctagtccgcctacgaatttagagcgattgaaaagttct 

85 IVFAQSSRSLAFRAMSSPPTNLERLKSS

49964 tccagttttggaattatattcgcaatcgcaatgttactatctacttga 49917 

113 SSFGIIFAIAMLLST* 
dplORF054 

14423 atgtgtgaaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcattcact 

1 MCENCQNETFNTRIFNEDESGYVDASFT

14507 tacaaggagattcgcgacaccgcagcagctattagcaatcgagcggtagaaaagaaagaccgtgacagccttttagtcgctaca 

29 YKEIRDTAAAISNRAVEKKDRDSLLVAT 

14591 gttatggctcttcccgtttctcacgcagaagatttaggcaagagactttgtattgcaaattctcgattggaagcattccgtgaa 

57 VMALPVSHAEDLGKRLCIANSRLEAFRE 

14675 gctgttcaagaggctctcgagaatgaaaaggctgaagatttaaaggacgttatcttaggtcttatcgacgttgacaaaaaaatt 

85 AVQEALENEKAEDLKDVILGLIDVDKKI

14759 ggcaaccttgcattgcaattagttgaatcaggagcattataa 14800 

113 GNLALQLVESGAL* 
dplORF055

27627 atgcctaatgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttg 

1 MPNVRVKKTDFNQTTRSIVAIPDHYVAL

27711 gctgctcaaattccagctaccgcagcaactcaagtagggaacaagaaatacattcttgccggaacttgcgtgaaaaatgctact 

29 AAQIPATAATQVGNKKYILAGTCVKNAT

27795 acatttgaaggacgcaaaactggactcgaagtagtatctaccggtgaacaattcgacggagttatcttcgctgaccaagaagcg 

57 TFEGRKTGLEVVSTGEQFDGVIFADQEV

27879 ttcgaaggtgaagaaaaagtaaccgtgacagtattagttcacggattcgtcaaatatgcagcccttcgaaaagttggcgatgct 

85 FEGEEKVTVTVLVHGFVKYAALRKVGDA 

27963 gtgcctgaatctaaaaacgcaatgattcttgtcgttaaatag 28004 

113 VPESKNAMILVVK*
dplORF056 

19151 atggaaaataaatggaaagttatccattttcaaaactcatgtattaaacaagtagacgatgaaaaaaggaggctcctgttcgaa 

1 MENKWKVIHFQNSCIKQVDDEKRRLLFE

19067 gtcccaggaactccttatcgtctacaagtttgggtgaaaatgagcttagttaaaattgaaacacgcgcaggaaacggctattat 

29 VPGTPYRLQVWVKMSLVKIETRAGNGYY

18983 aaaaggctagtatgccaagacgattttgtattttatggtaaggagtcaatagatggttacttaattgacgccacgataactggc

57 KRLVCQDDFVFYGKESIDGYLIDATITG

18899 aaatctttggcggaatattgtgagcctatgaacaggcatattctcgaaactattgcatcgcgagaagcagctgaactgaacaga 

85 KSLAEYCEPMNRHILETIASREAAELNR

18815 gctaaaaagcaagaccaacagaaatggagatactag 18780 

113 AKKQDQQKWRY* 
dplORF057 
9859 atgcaaaaatctctatttggacctaagctagtgcctgctagttcaaggcgcaagaaaagaacggttccaaaacctaaacctaaa

1 MQKSLFGPKLVPASSRRKKRTVPKPKPK 

9943 atcgatgagcaagtggttgagcttatgaaccgcagagagcgtcaagtgcttgttcatagttgcatctattattattttaatgac 

29 IDEQVVELMNRRERQVLVHSCIYYYFND 

10027 tcaattatagcagacgggcagtatgacaaatggagccacgaactatattctcttatagtttcgcaccctgatgagtttcgacag 

57 SIIADGQYDKWSHELYSLIVSHPDEFRQ 

10111 actgttctctataacgagtttaaacagtttgacggaaatactggaatgggtcttccatacgactgtcagtttgctgtaagggtc 

85 TVLYNEFKQFDGNTGMGLPYDCQFAVRV 

10195 gcagaaaggcttttaagaaaatga 10218 

113 AERLLRK* 
dplORF058

15633 atgacatcacgcgcatacaaaccaattcccacgcgcagagctagtgctaaacaagagaaggcagttgctaagcagttgggagga 

1 MTSRAYKPIPTRRASAKQEKAVAKQLGG

15717 aaagtacagcctaattcaggagccactgactactacaaaggtgacgtcgtaacagactcaatgcttatagaatgcaagacagtt 

29 KVQPNSGATDYYKGDVVTDSMLIECKTV 

15801 atgaagccacaaagttcagtcagcttgaaaaaggaatggttcctaaaaaatgaacaggaaaggttcgctcaaaaactcgactat 

57 MKPQSSVSLKKEWFLKNEQERFAQKLDY 

15885 tctgctatcgctttcgactttggtgacggaggcgaacagtatatagcaatgtctataagtcagttcaagcgaatattagaggat 

85 SAIAFDFGDGGEQYIAMSISQFKRILED

15969 agaaatgataaccttatttaa 15989 

113 RNDNLI*
dplORF059 

30154 atgtctcagcctgaattagtatggaagcctgaagaatttgttagtaactgtgaacggtatcgaaacaagtttcaagtcgctgcc 

1 MSQPELVWKPEEFVSNCERYRNKFQVAV

30238 ataacagtctgcgaagtcgctgctactaagatggaagaatacgcaaagacgcatgctatttggacagaccgtacagggaatgct 

29 ITVCEVAATKMEEYAKTHAIWTDRTGNA

30322 cgacagaaactcaaaggagaagctgcttgggtaagcgcagaccaaatcatgatagctgtatcacatcacatggactacgggttt 

57 RQKLKGEAAWVSADQIMIAVSHHMDYGF 

30406 tggctagaactagctcatggtcgaaaatacaaaattctcgaacaggctgtagaagacaatgtcgaagaactttttagagcgttg 

85 WLELAHGRKYKILEQAVEDNVEELFRAL 

30490 agaaggttattagactag 30507 

113 RRLLD*
dplORF060 

38070 gtgatagctgtatctgctatccctactccgctctttccaggtacaccgtcgactccatcacgcccaggagctcccggtaaacct 

1 VIAVSAIPTPLFPGTPSTPSRPGAPGKP 

37986 gcgtcacctttaggaccttctagtcgaatccatgtaaagtcgtcaggaactaattcgctcggtttcttattagtattaaggaca 

29 ASPLGPSSRIHVKSSGTNSLGFLLVLRT

37902 ccaatgtatttcccagattctgcattaaaattagtccctaaaatgtcatctgcgtatctaataacaacttgggactcatttaca 

57 PMYFPDSALKLVPKMSSAYLITTWDSFT

37818 gtttcccctgaaaggactccttcgccgtccccatttagcaagtccatcaagtcttttcgagggtcttggaaaatgatagtagag 

85 VSPERTPSPSSFSKSIKSFRGSWKMIVE 

37734 tttgaaaggtcgtcgtag 37717 

113 F E R S S * 
dplORF061 

19475 atggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaaaatgaaattcgaagtttattctgcgcgacta 

1 MARMQRLCPMKFWKAVTKMKFEVYSARL

19391 tttgacgaagaggcgacatatgataggtatcgtgaagcactagagaaagttggaaatgtcgcttacttttgtgaaattgatact 

29 FDEEATYDRYREALEKVGNVAYFCEIDT 

19307 ggcaaccttgtaatcgaactcgagctagacagcctagatgacctaatcgcgctttcaaatgtagtgggaactggactaaaatta 

57 GNLVIELELDSLDDLIALSNVVGTGLKL 

19223 tcacggccctatagagaagataagccttttcaattatggattgttgacgggtacatggaataa 19161

85 SRPYREDKPFQLWIVDGYME* 
dplORF062 

45284 gtgagaagcttcaatcaattccatcgcggtgtcaatatcttcttccctgacgagtttaaaaattccgtcaatcgcccattcgta 

1 VRSFNQFHCGVNIFFLDEFKNSVNRPFV

45200 agatgcaggagcaatagatgcaagaagtttcttttggtcttctgtcaacccttttgcgcgaactccaatagaaacaccttttcg 

29 RCRSNRCKKFLLVFCQPFCAMSNRNTFS 

45116 agtttcttcgatagtaacgaagttcttcttagagcgataggcgatgtaagactgagtgacgattcgagtagacgcaggaaaggc 

57 SFFDSNEVLLRAIGDVRLSDDSSRRRKG

45032 ttcaacaattcgactttcaagagccttagtaatcgccatcacgctttcttttttaggagcaggttttcgaacagtagatttctc

85 FNHSTFKSLSNRHHAFFFRSRFSNSRFL 

44948 actaactga 44940 

113 T N * 
dplORF063 

47200 atgaaattcactgaaggaaaaaattggtataaagttggagagatatgtcaaatgttgaaccgctctctatctacgattaatgtt 

1 MKFTEGKNWYKVGEICQMLNRSLSTINV 

47284 tggtatgaagcaaaagacttcgccgaagaaaataacattcacttcccgtttgttcttcctgaacctagaacagaccttgaccac 

29 WYEAKDFAEENNIHFPFVLPEPRTDLDH

47368 cgtggttctcgattctgggatgacgaaggcgtgaacaaactcaaacgacttagggacaacctaatgcgcggtgacttggcattc

57 RGSRFWDDEGVNKLKRFRDNLMRGDLAF

47452 tacactcgaactcttgtagggaaaactgaaagggaagcaattcaagaagatgctaaagcatttaaacgtgaacatggattggag 

85 YTRTLVGKTEREAIQEDAKAFKREHGLE 

47536 aattaa 47541 

113 N * 
dplORF064 
29108 at^acattgaaa^

29276 LaattgaaLagttgaagaaattgacgaagttgaacaaatgcgcgaagagtatgcggctaaaaccg^ 

57 TIESVEEIDEVEQMREEYAAKTVPELVE
29360 ttagcaagagctaatggaattgacatttcttcaatttctcgaaaaagc^ 

85 LARANGIDISSISRKSEYIDALIKYELG

29444 gagtaa 29449 
113 E * 

dplORF065

51497 atgcagtttgtcataacctacatcaaacatctcgatgagctcgtccgccaatttccgttcatacatataaggatgaataaaccg
1 MQFVITYIKHLDELVRQFPFIHIRMNKP

51413 gtatttatcaagttcctcttcaggaatgatt^

29 VFIKFLFRNDFMLDFFSSPISSKRFRAD

51329 gccttgcctaactacttcgctagacgtcccaaaattccttttcagccactggtttccatagaaccctccatcgtctcgacctaa 

57 ALPNYFARCSKIPFQPLVSIEPSIVST*

dplORF066

28898 gtgaccaactgcgccaggtggaagcaataccactttaccgtcgtcaatcaagttgaactgacgaatgttaccaacg^

1 VTNCVRWKQYHFTVVNQVELTNVTNVRK

28814 tttgtcagcgtcagcgaactgagcaattttcttagagtagacagcgatttgaagaw 

29 FVSVSELSNFLRVDSDLKTCFFSDEFLS

28730 gtcacttgcaagaagcaagaagttttcccaagaaccttgaacaccaattgcaagag^ 

57 VTCKKQEVFPRTLNTNCKSFLDRVTLSH

28646 ttggctataagtgcttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaa 28566

85 LVISVSVQDH RANTCTIFDVIHCC



dplORF067

45061 gtgacgattcgagtagacgcaggaaaggcttcaacaattcgactttcaagagccttagtaatcgccatcacgctttctttttta

1 VTIRVDAGKASTIRLSRALVIAITLSFL

44977 ggagcaggttttcgaacagtagatttctcactaactgaaccaacttcttccggctgttccttaacttcaggaatttcttcct^ 

29 GAGFRTVDFSLTEPTSSGCSLTSGISSS

44893 aggacttcttttttaggcttgggaacgactccaccttttcgagcaggtcgagcaactgcaggagcagcctttttagcaggttta 

57 RTSFLGLGTTLPFRAGRATAGAAFLAGL

44809 gcagcttcttcttttttaggttcagtttcatcttccattgtgcaccaacgttcgagagttgaagctgaaaggtga 44735 

85 AASSFLGSVSSSIVYQRSRVEAER* 

dplORF068

tcaaattgaattagtcaaaatcaatatcgataacgataattctccgtcaccaatgactgac

1 MAAQTDIELVKINIDNDNSPSPMTDQSI

29535 tcagctcttttagacaagcataaatctgtcgcctatgttagttatatgatttgcttaatgaagacccggaatgacgtggt 

29 SALLDKHKSVAYVSYMICLMKTRNDVVT 

29619 cttggacctatcagtctaaaaggtgacgcagactactggaaacaaatgg^^ 

57 LGPISLKGDADYWKQMAQFYYDQYKQEQ

29703 cttgaaactgatgaaaagtcgaacgctggttcgacaatcttaatgaaaagggctgatgggacatga 29768

85 LETDEKSNAGSTILMKRADGT*

dplORF069

20411 atgaaactttatcacgccactgattttgataatc^^

1 MKLYHATDFDMLGKILAEGLKPSAGVIY

20327 ctagcagaaagttatgaaaaggctctagcctttttatcgctt^ 

29 LAESYEKALAFLSLRNVDTIVVLELEVD

20243 attgaaaaatgtaccgaaagtttcgaccat^ 

57 IEKCTESFDHNEKMFCSLFHFDTCRAWT

20159 tatgacaagacaattgaagtagacgacactgacttttcgaaagctcgaaaatatgatagaaagtga 20094 

85 YDKTIEVDDTDFSKARKYDRK*
dplORF070

1 MITLFKINSEGTVTPIKGSAMQLYADLI

16057 cctatacaagaggacgatatacagttcgttgatataactggacttgaccctattgttcgagaaaacgtacttgagctcattt^ 
29 PIQEDDIQFVDITGLDPIVRENVLELIS
16141 cggagccgtgtaggagtttcaaaacatggcacaaacctcgaccagaacgatgtcgacgatttcctacagcacgccaaagaagaa 
57 RSRVGVSKYGTNLDQNDVDDFLQHAKEE

16225 gcgcccgactttgctaactacctaaccaagctacaaagtcaacaaaagcaaaataaacag 16284 
85 ALDFANYLTKLQSQQKQNK* 

dplORF071

38904 gtgaaacaggccctagaggagttcaaggtcttcaaggtcctcaagggcttcaaggaattcctggacctgcaggagctgacggac
1 VKQVLEEFKVFKVLKGFKEFLDLQELTD
38988 gttcgcaatatactcacctcgctttctctaatagtccaa^ 

29 VRNILTSLSLIVQTVRDLVILTADEHTS
39072 gtcagtatcaagatttcaatcccgcccattcaaaagaccctgcagcctatacatggacgaaatggaaggggaatgacggagctc 
57 VSIKISIPSIQKTLQPIHGRNGRGMTEL

39156 aagggatacccgggaagccaggcgcagacggtaagactaattatttccatatag 39209 
85 KGYPGSQAQTVRLIISI*

dplORF072

51045 atgttccttcgtcttcaagttgtctcgaaagtttttcaattatttgctcaggagtcgcttcaatttgaagaccacttactttca

1 MFLRLQVVSKVFQLFVQESLQFEDHLLS

50961 tcaaaatgcttcaactccttcccttgtaaccttacttcgaagacgagcagtcgacctagaggcttttgctttcaatggagagct 

29 SKCFNSFPCNLTSKTSSRPRGFCFQWRA 

50877 ttcgcctttttcagttccttcttcgccttcctctttgaatcctataagagtataggttccagtttcaacgtcccacatatattc 

57 FAFFSSFFAFLFESYKSIGSSFNVPHIF 

50793 gatgatttttcggtcttcgccatatcggcttttaacgacagatag 50749

85 DDFSVFAISVFNDR* 
dplORF073 

14262 gtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaacagaaaaaccta 

1 VNACRKNTTKKLGNLSLKQNTSSEQKNL

14346 aagcagttgcaaaacctactcgaaaaactccagcgccctctcgtcgccctcgcccttaaaagaaaggttgaaataaaatgtgtg 

29 KQLQNLLEKLQRLLVALALKRKVEIKCV

14430 aaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcactcacttacaagg

57 KIVKTKHSILEFSMKMKVAMSTPHSLTR

14514 agattcgcgacaccgcagcagctattagcaatcgagcggtag 14555 

85 RFATPQQLLAIER* 
dplORF074 

32298 gtgacgaaaagaaaaatccaggattgcaaatgcttatggagtgactattttcagtcgctcctctttttgtatatagaaaggaaa 

1 VTKRKIQDCKCLWSDYFQSLLFLYIERK 

32382 ttacatggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtca 

29 LHGFWVNCSKNDFGYLKLHKSIKSCSKS 

32466 agcgcaacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgc 

57 SATARTRVFEVLSNWFCFNRIRERTYDC 

32550 ggttacccttcctcttatgggatttgcagccgcctctattaa 32591 

85 GYPSSYGICSRLY* 
dplORF075 

22447 atggcaaagttttgtccgttgaattccgtcatggcccaaagggaaaatgaaagagccatcgatactgtttttcctgaacgaatg

1 MAKFCPLNSVMAQRENERAIDTVFPERM 

22363 gaaccgtctgctatgacgatatcgaaagttcgaaaaggtgagccctttgtccaccatgttaggagctggagttgtttcttacta 

29 EPSAMTISKVRKGEPFVHHVRSWSCFLL 

22279 aaagggacgaagttgaacttaggtagtttatttctcaggcttattgtcattatcagtcactcctttaatgtaggaacctgttgc 

57 KGTKLNLGSLFLRLIVIISHSFNVGTCC

22195 gtcactaaattcttgccaaacggcttgagctgctttatctag 22154 

85 VTKFLPNGLSCFI*
dplORF076 

5728 gtgagagcattttcttcactcacgtcttcgagcaagtggtcgaatgtagggtactcttcatcttctgtaacaatatcaatattg 

1 VRAFSSLTSSSKWSNVGYSSSSVTISIL 

5644 tactcaccattcccaataacttttagcgaagattcttcaggaactaatgtgacggttgcggccgtggtcttttctacaagtttt 

29 YSPFPITFSEDSSGTNVTVAAVVFSTSF 

5560 ccaaactgctctgctttcacaatcacgtcaatttcaacatcgctgtcgataatgcatcgaaggaagtttgagccatcatacgct

57 PNCSAFTITSISTSLSIMHRRKFEPSYA

5476 gtaaacatgacgcattcgccgtcaccaaaaatatgccaacag 5435 

85 VNMTHSPSPKICQ* 
dplORF077 

14800 atggaacgaataaagacgctatttcacgcgatttatgctaacggcactcacttagaagtagcagctttgttcgataccgttgat 

1 MERIKTLFHVIYANGTHLEVAALFDTVD 

14884 gattatgatgacgttatagaggacatccaggggtatattgatacccctgacctttataatcaaaggagcattagaatggcgcct 

29 DYDDVIEDIQGYIDTPDLYNQRSIRMAP 

14968 tacaatcctgacatcaatggtgacgctattgctactgacattttactacgactagatgatattatctacgtcgacgcaacttgt 

57 YNPDINGDAIATDILLRLDDIIYVDATC

15052 gaaactattaaatacgaggagcctattgcatga 15084 

85 ETIKYEEPIA* 
dplORF078 

17507 atggcaacagtaaaggaaacagtaaaacttgacggacgtcttgtaactatcttcgactacgacgatttagagtgggaaggatat 

1 MATVKETVKFDGRLVTIFDYDDLEWEGY

17423 gcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagtgg 

29 APNEGFEDVEDMEVLSIRVRNEGEDDEW 

17339 gttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataa 17280 

57 VEVIACYENDDEDEDLEGL* 
dplORF079 

35288 atggaactgataccattgataaatcctcgaacaaggttgacccctgcgcttaccatttgtccagcgaatccagtaaccttagaa 

1 MELIPLINPRTRLTPALTICPANPVTLE 

35204 acaattgaagttcccatgctgccaattctagagacagctgaaccaatcattgacccaataccactaatgaagtttcgaatcagg 

29 TIEVPMLPILETAEPIIDPIPLMKFRIR 

35120 ttcgcacctcctgaaaccatctgtcccacaaagctagcaatcttgctaactaatgatgaaagcatgtttccagctgtcgataaa 

57 FAPPETICPTKLAILLTNDESMFPAVDK

35036 agtgagccgagaagtgaagcaataccttga 35007 

85 SEPRSEAIP* 

dplORF080

42490 atgttgaaccttacaaaatcgcgccaaattgtggcagagttcactattggacaaggagctgaaaagaaacttg<^a"aacaacg 

1 MLNLTKSRQIVAEFTIGQGAEKKLVKTT

42574 attgtgaacattgacgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgcatgctgcgaaccgtcgagaa 

29 IVNIDANAVSTVSETLHDPDLYAANRRE

42658 cttcgagctgacgagcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattccagctgaacagtcaaagactgaaaca 

57 LRADEQKLRETRYAIEDEILAEQSKTET 

42742 gctctaacagctgaataa 42759 
85 A L T A E *
dplORF081 

55466 atgttcaggaacagtatcgtccatctgctggtctgcgtcaaagttaaaggggtcgaaatcttcgttcttgctagcgtcgatata 

1 MFRNSIVHLLVCVKVKGVEIFVLASVDI

55382 ctcgaactcgtattcaggaagactcataccaggaagccttcttcttcgaccggtagctgtttgaacatatcccaagtcctgcgc 

29 LELVPRKTHIRKPSSSTGSCLNISQVLR 

55298 ctgctgttgaacgaatatgatatagtctgccaccttagggaactcggtgaagaaatcttcaataaccttattcgcttctttgac 

57 LLLNEYDIVCHFRELGEEIFNNLIRFFD 

55214 agatacattcatctgctcagcgattga 55188

85 RYIHLLSD* 
dplORF082 

44728 gtgaacttcacctttcagctccaactctcgaacgttggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgcta 

1 VNFTFQLQLSNVGTQWKMKLNLKKKKLL 

44812 aacctgctaaaaaggctgctcctgcagttgctcgacctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttg 

29 NLLKRLLLQLLDLLEKVESFPNLKKKSL

44896 aggaagaaattcctgaagtcaaggaacagccggaagaagttggttcagttagtgagaaatctactgttcgaaaacctgctccta 

57 RKKFLKLRNSRKKLVQLVRNLLFENLLL 

44980 aaaaagaaagcgtga 44994

85 K K K A * 
dplORF083 

35974 atgccttcagggtttttaaatcctgagtccttaaatcctgcgaaagtgagtcctacatattctagcacggttgcacctttgtcg 

1 MPSGFLNPESLNPAKVSPTYSSTVAPLS 

35890 acaaggtcaattccgtcgaccaatagcgtctgtctgctagccatctatttctcctttacggtgttacaatgttaccaaaccctg 

29 TRSIPSTNSVCLLAIYFSFTVLQCYQTL 

35806 atagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaactacgattgttccaatgt

57 IEFLYFYYTILSTVCQRRHCFELRLFQC

35722 tga 35720 

85 * 
dplORF084 

15445 atgaattatatggtaaaagtcattctagttagtgtctttgtactgtcagccttttgcatgacttgctcaatggtttatttggtt 

1 MNYMVKVILVSVFVLSAFCMTCSMVYLV 

15529 acaggtaagcaagaggaccaccgtagtaccgtcgcccttgtatttggcgctctcgtaagctctgcggcgttctattcgacactc 

29 TGKQEDHRSTVALVFGALVSSAAFYSTL 

15613 tttatcctcgcctatctgccatga 15636 

57 FILAYLP* 
dplORF085

10847 gtgatgactataatcaaggactttttcgagccttgtgatactgtcacgcattcctccatttgcaagtttcccaataaacgaaag 

1 VMTIIKDFFEPCDTVTHSSICKFPNKRK

10763 ggcgtcacgctcataactataaccagctccttcttcattttcactttcgataataaattgaagttgattaacgatgtcgtcatt 

29 GVTLITITSSFFIFTFDNKLKLINDVVI 

10679 atcaattcgagtaaagtcaaaccgttgaactcgactgagaatagtgtcaggaatcttttgagggtcagtagtacatag 10602 

57 INSSKVKPLNSTENSVRNLLRVSST*
dplORF086 

52760 atatgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtg 

1 IWEKYQFKNQEHLAQGLITSFSHSLTTV

52844 acagcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtag 52906 

29 TAQLSLYCMMTRKAKTWIIS*
dplORF087 

30036 atgattttgccttcatcatatagaatgaaaattttcactccattttgggcaaaaatttttcccgcgtcagtagaattggctaaa 

1 MILPSSYRMKIFTPFWAKIFPASVELAK

29952 aggtcaggaacagttgaattatcaactaaacaaacaaggtcgtctgctacgacttcattcgctttatcctttttctttcctcca 

29 RSGTVELSTKQTRSSATTSFALSFFFPP

29868 tatccatcactgacccaagagtttcgaagtaccttgattttagtaggagcggtttcaatggctctacgaacttga 29794 

57 YPSLTQEFRSTLILVGAVSMALRT* 
dplORF088 

5040 atgaaaaaagttcaaacttatcaagaatatctaaaactagttgagttcaaacgtcaactttctttaaatcttcgagaaggaaaa 

1 MKKVQTYQEYLKLVEFKRQLSLNLREGK

5124 ataggagtcgatgaagcggttattcaattattcaccttctatagtttcaacaatatcgaggaacctcctttcattgtactcaaa 

29 IGVDEAVIQLFTFYSFNNIEEPPFIVLK

5208 atgcaagaggctgccgtgaacgggacttatgaagcaaaactcaatatgcttaaaagatctaaaattatttag 5279 

57 MQEAAVNGTYEAKLNMLKRFKII*
dplORF089

12495 atgtcaatcatgtcgctatcaacagtcgagtatttagacacaaaatgcctcttcaactgcgcgtcagtcactttctcaaactca 

1 MSIMSLSIVEYLDTKCLFNCASVIFSNS 

12411 acacaattatcaggaaaggcctttagcaacttgcttcgcttgccaactttagtaaccaccaaaacaagtgtcccatatctaaca 

29 TQLSGKAFSNLLRLSILVTIKTSVPYLT 

12327 tccggaagccttttccacctcgactcattagacagaaactccttatcatctcgaacagcgaatattcgatga 12256 

57 SGSLFHLDSLDRNSLSSRTANIR*
dplORF090

27037 atgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgctcaacagcgcgatgcag 

1 MLKFSLTATVNILYLTHVSMKLFNSAMQ 

27121 ctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagacaacggtgatgcgcagaccacta 

29 LTAQLILIKNKSRRFLNRSKITVMRRPL

27205 tccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtga 27261

57 SKTFKSNSTSSLNLQKAL* 
dplORF091

43189 atgaaactatctaacgaacaatatgacgtagcaaagaacgtggtaaccgtagtcgttccagcagcgattgcactaattacaggt 

1 MKLSNEQYDVAKNVVTVVVPAAIALITG

43273 cttggagcgttgtatcaatttgacactactgctatcacaggaaccattgcacttcttgcaacttttgcaggtactgttctagga 

29 LGALYQFDTTAITGTIALLATFAGTVLG 

43357 gtttctagccgaaactaccaaaaggaacaagaagctcaaaacaatgaggtggaataa 43413 

57 VSSRNYQKEQEAQNNEVE*
dplORF092 

46989 atgaaaactatctccatattaaggaaagacactaaaaggaagccggacaggaacggaagaaaaactgcactcgaactagctcaa 

1 MKTISILRKDTKRKPDRNGRKTALELAQ 

47073 gagattgatatgtcacctagtgagttagcagagctccttcaaattcctgaaaggacggcaaccagaattttaaaactcgacaaa 

29 EIDMSPSELAELLQIPERTATRILKLDK

47157 ctgctcaacaaagagcaatgctcaataatagaaaggtatataaatgaaattcactga 47213 

57 LLNKEQCSIIERYINEIH* 
dplORF093 

45756 atgcaacatacgattaaacaatgtttgaaacttgcctccctgctaactgcaatatcaattgcccgtttagttttccctaaacct 

1 MQHTIKQCLKLAFLLTAISIACLVFPKP

45672 tgctcatcgcctaaaaggaaacatggatgctcttgtgcgtattcgaaacattcaacctggtgcgcgaatggagtagtcttgaac 

29 CSSPKRKHGCSCAYSKHSTWCANGVVLN

45588 gaaaactgctcattgcttgaagaagctattcggtttcgagagtcaatgtag 45538 

57 ENCSLLEEAIRFRESM* 
dplORF094 

8281 atgtacgaattagttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaag

1 MYELVLSLKLTPTAPMSQDVEKCFKRLK 

8365 tatattcagtggcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgc 

29 YIQWRQVNALKLHTDLLLNFLRDMKQSC

8449 atcctcgttccagtctttttaagaaaactggtctaa 8484 

57 ILVPVFLRKLV* 
dplORF095 

8877 gtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaagactgaagaac 

1 VGKLLQLSTLSRMRKWYLSRNGNRRLKN

8961 tcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactcgatagtcttc 

29 SRKSWKMRVHPKLARLLSRNLKCNSIVF 

9045 aagagcctcttaagattgtatatcttgaccttgagaatacattag 9089 

57 KSLLRLYILTLRIH* 
dplORF096 

46681 gtgattcataaattcttcaatttcgttgaacttatctgcggtttctcctgttaccaggttgcatttgactgtcttcgaaagtat

1 VIHKFFNFVELICGFSCYQVAFDCLRKY 

46597 cttagcaagaggttcaataaccttttcccaattgctaaatatcacgcaggactttccttgctggatacattcctcgacaatttc 

29 LSKRFNNLFPIAKYHAGLSLLDTFLDNF 

46513 gatacatctttcgaacttgcaagacttgacatcttgagtagttaa 46469 

57 DTSFELARLDILSS* 
dplORF097 

39100 atggacgggattgaaatcttgatactgaccgacgtatgctcgtccgctgtcagtatgactaaatccctcaccgtttggactatt 

1 MDGIEILILTDVCSSAVSMTKSLTVWTI

39016 agagaaagcgaggtgagtatattgcgaacgtccgtcagctcctgcaggtccaggaattccttgaagcccttgaggaccttgaag 

29 RESEVSILRTSVSSCRSRNSLKPLRTLK

38932 accttgaactcctctaggacctgtttcacctatcttggaaactga 38888 

57 TLNSSRTCFTYLGN* 
dplORF098 

43627 gtgaaaatgctccgtgggatgctaaacgaggcgacatcttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcata 

1 VKMLRGMLNEATSSSGDAKVLAQALEVI 

43711 cagggatgttcattgacagtgataacatcattcactgcaactacgcctacgacggaatttccgtcaacgaccacgatgagcgtt 

29 QGCSLTVITSFTATTPTTEFPSTTTMSV 

43795 ggtactatgcaggtcaaccttactactacgtctatcgcttga 43836 

57 GTMQVNLTTTSIA* 
dplORF099 

38298 atgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtatttat 

1 MQVRHLLLKLQLVDGLRKFLPSQVVSIY 

38382 ggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcctaaag 

29 GLEQDGATLTKLMKLDIQFQEWASRVLK

38466 gtgacgcaggtcgtgacggtattgcaggaaagaacggaatag 38507 

57 VTQVVTVLQERTE* 
dplORF100

1597 atgcagttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacgg 

1 MQLTPSEFYLDLELRLRICQDSLPGLSR 

1681 agcttatgtggaagcatgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgaga 

29 SLCGSMLVSTLSNYGKLLQVAQNVLTTR 

1765 ttttcacagaagacgagattgaaatgttcaagaacgtaa 1803

57 FSQKTRLKCSRT*
dplORF101

19220 gtgataattttagtccagttcccactacatttgaaagcgcgattaggtcatctaggctgtctagctcgagttcgattacaaggt 

1 VIILVQFPLHLKARLGHLGCLARVRLQG

19304 tgccagtatcaatttcacaaaagtaagcgacatttccaactttctctagtgcttcacgatacctatcacatgtcgcctcttcgt 

29 CQYQFHKSKRHFQLSLVLHDTYHMSPLR
19388 caaatagtcgcgcagaataaactccgaatttcattttag 19426

57 QIVAQNKLRISF* 

dplORF102

tatgggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagc

1 MITWECLTVSPNSIKFLVYLDSLRHVNS

4118 ttttggaagcaccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattt 

29 FWKHHKFLGIIIVTCASEWLRKTSSYLF

4202 tccatatgggagaagactttaaatggctcaacttga 4237 

57 SIWEKTLNGST* 

dplORF103

49352 ttgaatcatagatatagtaacatcacaactatttttctttggcagattgtctttctctgtatttgctgcgcggtgtcctattgt

1 LNHRYSNITTIFLWQIVFLCICCAVSYC

49436 gcaggagtgcataatgagcgagagcctcaagataaggtgattcaaagttataagcagaaagaaaagtcagccgtctacttgaca 

29 AGVHNERESQDKVIQSYKQKEKSAVYLT

49520 gtcgatagttcaggagcttggctaggaagtgctccgggagccaaggaaagtcctctctacaatgaaaagggacagcatgtagga

57 VDSSGAWLGSAPGAKESPLYNEKGQHVG

49604 aaattgaaagaggtgggagagtga 49627 

85 KLKEVGE* 

dplORF104

21427 atgagaaaaagagtgattttgaagctaaaaaggttgaactggtatgtccttaattcctactctcgaatggttgagtttttcgaa
1 MRKRVILKLKRLKWYVLNSYSRMVEFFE

21343 cttttgaacttttcgaatggttcgacctttcgaaggattgaggttttcgaaccggttgagtttttcgagcattctcgacttttc 
29 LLNFSNGSTFRRIEVFEPVEFFEHSRLF
21259 gacccctttctatgctcgacttttcgagtgttttga 21224 
57 DPFLCSTFRVF* 
dplORF105 

2028 atgatagccgcatccaccagttcgaatgaaaatagtcttttgacctataaccactccttcaccttgaattgtaggaccgaaaat 
1 MIVASTSSNENSLLTYNHSFTLNCRTEN

1944 ttccatgataggcattttctcagggtcgcgaacattgattcgaatcttgcctctttcaggccgattgtattgattaaccattat
29 FHDRHFLRVANIDSNLASFRLIVLINHY
1860 cctgctcctgccctaaaatttcgcggacagtaa 1828 

57 PAPALKFRGQ* 

dplORF106

10529 atgaacctcgtcaatgatgtaaactttgaactcgctgtccatagacttgtatctagaatcttcaataatgtttcgaacattttc
1 MNLVNDVNFELAVHRLVSRIFNNVSNIF

10445 taccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcgaaaattcgagcagtagt 

29 YPIIRSSINFNRRAKSFVHILRENSSSS
10361 ggttttaccagttccagcgccaccacagaatag 10329 
57 GFTSSSATTE* 

dplORF107

10750 atgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgtgacagtatcacaaggctcgaaaaagtccttg
1 MSVTPFRLLGNLQMEECVTVSQGSKKSL
10834 attatagtcatcacgttgacatggaagccgtttctaatgcactag 10878
29 IIVITLTWKPFLMH*

dplORF108

49447 atgcactcctgcacaataggacaccgcgcagcaaatacaaagaaagacaatctgccaaagaaaaatagttgtgatgttactata
1 MHSCTIGHRAANTKKDNLPKKNSCDVTI

49363 tctatgattcaatttcgcttacctccaatccccttacattgcttgcctgaaaatctagaaccactgaagtatcatatatacgac 
29 SMIQFRLPPIPLHCLPENLEPLKYHIYD
49279 tacaaagcctttggcctaaaaggtcaataa 49250 
57 YKAFGLKGQ* 
dplORF109

31632 atgtggttgtcgaagtcccaaatagttgattctccttcaactttccagcctttgaaagccttacctgttaaggtagggtcaact 
1 MWLSKSQIVDSPSTFQPLKALPVKVGST 
31548 ggttttggagaaatcttcttacctgcttcaactcgaactgcgccggcggttcctgttccaccgttcaaatcgaatgtcacgcga 
29 GFGEIFLPASTRTASAVPVPPFKSNVTR

31464 cgaagaaccgctggaagttgtgccacatag 31435 
57 RRTAGSCAT* 
dplORF110

16444 atgatttcaactctagcatcaacctccatgtcgcgagtaagtgtgactccagtttcagcgacaggacatgctttgaatactgca 
1 MISILASTSHSRVSVTPVSATGHALNTA
16528 atgtcaagttcgctctttccaataactgagcctaggcctaagtacaagttaggattgattccagtgaccttatattgtttctca

29 MSSSLFLITEPRSKYKLGLIPVTLYCFS

16612 gtttcttttacaggaatgctttcatag 16638 

57 VSFTGMLS* 
dplORF111

28657 gtgactctatcaagaaagctctcgcaattggtgttcaaggttcttgggaaaacttctcgcttcttgcaagtgacgctgagaaac 

1 VTLSRKLLQLVFKVLGKTSCFLQVTLRN

28741 tcatcgctgaaaaaacaggtctccaaatcgctgtccactccaagaaaattgctcagttcgctgacgctgacaaacttcctgacg 

29 SSLKKQVFKSLSTLRKLLSSLTLTNFLT

28825 ttggtaacattcgtcagttcaacttga 28851

57 LVTFVSST* 
dplORF112 

32207 atgcaaactgatttaggcaaatactgcttcgacgcagcagccgttgcttatattagatatttgcaggaagacaagactcctagg 

1 MQTDLGKYCFDAAAVAYIRYLQEDKTPR

32291 tatcctggtgacgaaaagaaaaacccaggattgcaaatgctcatggagtga 32341 
29 YPGDEKKNPGLQMLME*
dplORF113 

17715 atgaaaacagttaaagaagcaatcaaacaattcggtgatgaatggtggtacgaaattatcaacgaaaacggccaaatgattcaa 

1 MKTVKEAIKQFGDEWWYEIINENGQMIQ

17631 gacggaagaatcgaagacatgggcgaatacatggaagaaacggtcgaccaagttaagttcatcaactatggtgacatcgaatct 

29 DGRIEDMGEYMEETVDQVKFINYGDIES 

17547 caaattatcaaactatatatcgcataa 17521 

57 QIIKLYIA* 
dplORF114

52952 atgctattggcgaagacggggaaacagtccatcccgataattgtccattatgccaaaacggattccctcgtactgaaaaactat 

1 MLLAKTGKQSILIIVHYAKTDSLVLKNY

53036 ttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttatgttcaaaagattgttacattta 

29 FFNFTTMIREKLKHGTEAVLMFKRLLHL 

53120 tcaataaatatggaagccttgtga 53143 

57 SINMEAL* 
dplORF115

5342 atgagcctcctttttttgatatatataatatacacgaatcatcgcgagtttgtaaagccgtttctaaataattttaaatctttt 

1 MSLLPLIYIIYTNYREFVKPFLNNFKSF

5258 aagcatatcgagttttgcttcataagtcccgttcacggcagcctcttgcattttgagtacaatgaaaggaggttcctcgatatt 

29 KHIEFCFISPVHGSLLHFEYNERRFLDI 

5174 gttgaaactatagaaggtgaataa 5151 

57 VETIEGE*
dplORF116

20662 atgaaattttcaaactttgctaaagcacttactaatgaatacctaatggtagtgaacaatgaccaagctgaagtcttaggcgca 

1 MKFSNFAKALTNEYLMVVNNDQAEVLGA

20578 ggaaatatcgaaaacattctcaacggttcgaactttgctaatgttgtagctgaagcgacagttttaaaactcgaaaaactcagc 

29 GNIENILNGSNFANVVAEATVLKLEKLS

20494 gaagaggaagctatcgagtag 20474 

57 SEEAIE* 
dplORF117 

24680 atgataacaggctgctcgaacattttaaatcgaagtgaatctcgtaagtcactaatagttttgttcaagttatctgctactgtg 

1 MITGCSNILNRSESRKSLIVLPKLSATV 

24596 ataaggtctttgacatcgcttgtcccgtatatgtcattagtcaatggttcattaagaataactcgacaaggaatttgcttcaag

29 IRSLTSLVPYMSLVNGSLRITRQGICFK 

24512 ccggttggggcggattcttga 24492 

57 PVGADS* 
dplORF118 

15023 atgatattatctacgtcgacgcaacttgtgaaactattaaatacgaggagcctattgcatgaacaatcagcgaaagcaaatgaa 

1 MILSTSTQLVKLLNTRSLLHEQSAKANE

15107 caaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttccttcttgctgtaaaggaccacggcga 

29 QTNRRTSRRLSTCKRSNKLPSCCKGPRR 

15191 agaactcgaaaaccttga 15208 

57 R T R K P * 
dplORF119 

41054 atggaggttcaacatccccgattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagat 

1 MEVQHPRFSTSYFFGHFFSRHDFSGSTD 

41138 tttaacagggaacaacttcctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggacccactat 

29 FNREQLPPNHVEHSSQLQQCFRRLRIHY 

41222 ccaagcatttcacgctga 41239 

57 P S I S R * 
dplORF120 

28387 gtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggtaaattcactgtcaaatcaactaacagcgaggctc 

1 VLKRKQNTCVCNCFNTVNSLSNQLTARL

28471 aatacacttacgactacaacatggatgctaagcaacaatatgcagtcactaagaaatggactaacccagctgaaagtgacccta 

29 NTLTTTTWMLSNNMQSLRNGLTQLKVTL 

28555 tcgctgacattttag 28569 

57 S L T F * 
dplORF121 

39222 gtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatatgggttattactccgattatgagcaagcagata

1 VQTDHVSSVWKIIINNIWVITPIMSKQI 

39306 gcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattctttat 

29 AGIELSIDGLTALPMFKWEVETSSLILY 

39390 ttgaatttggtttaa 39404 

57 L N L V * 
dplORF122 

40402 atgttattctccttatcctacataccgaatcacgttcatgtctggattaaacgagtattgttccgttctaaatcggccgacttg 

1 MLFSLSYIPNHVHVWIKRVLFRSKSADL

40318 aatggattgggtaaagatcccgttatcgatgtgaatgaacccttgcgtaaggtacataacttcattccctgcggagaacataga 

29 NGLGKDPVIDVNEPLRKVHNFIPCGEHR

40234 aattcggtcacttga 40220

57 N S V T * 



dplORF123 

21327 atggttcgacttttcgaaggattgaggttttcgaaccggttgagtttttcgagcattctcgacttttcgacccctttctatgct 
1 MVRLFEGLRFSNRLSPSSILDFSTPFYA

21243 cgacttttcgagtgttttgaggttttcgagcaggttcgacttttcgagaaattgagtttttcgacctctaaattaggctcgatt 

29 RLFECFEVFEQVRLFEKLSFSTSKLGSI 

21159 attcgaaaagtttag 21145 

57 I R K V * 
dplORF124 

17891 atggtaaaagttaaagatttgcaagtaggaatgaaagttgtaaatgcaaaaggtactgaattcaaagtaactgaccgtcaaggt 

1 MVKVKDLQVGMKVVNAKGTEFKVTDRQG

17807 cgtaaatgggtaagcctagaacgtcttagtgatggacgtattcggttctatgataacgaatcactaatggacgaaaaagtggag 

29 RKWVSLERLSDGRIRFYDNESLMDEKVE

17723 gtagtaaaatga 17712 

57 V V K * 
dplORF125 

49916 atgtcctcagccgcttccgttaaaattggaacaagtgaattatatagatgctcctcttttagcttgtcgataaggtattcatca 

1 MSSAASVKIGTSELYRCSSFSLSIRYSS 

49832 gtttcgccaatttcgaaaaattcgaatccaggaaaatggtcgagaatagtttcgtcgtccggaactcttccatatctcgaaaag 

29 VSPISKNSNPGKWSRIVSSSGTLPYLEK 

49748 tgttcttga 49740 

57 C S * 
dplORF126 

16136 atgagctcaagtacgttttctcgaacaatagggtcaagtccagttatatcaacgaactgtatatcgtcctcttgtataggaata

1 MSSSTFSRTIGSSPVISTNCISSSCIGI 

16052 aggtctgcgtacagttgcatggctgaccctttaattggagtaactgttccttcactgtttattttaaataaggttatcatttct 

29 RSAYSCMADPLIGVTVPSLFILNKVIIS 

15968 atcctctaa 15960 

57 I L * 
dplORF127 

13511 atgctaaatagctttcccattcaccgtcgctgttcttgcgccatttttcagtttcacgatactgaccaactttgcaaaggtcgt 

1 MLNSFPIHRRCSCAIFQFHDTDQLCKGR

13427 gaaatagtgctacgattgcaactgtttccattgggtaaatgtcttcccagcctttgcctaccatggtatccatttcgaaaagta 

29 EIVLRLQLFPLGKCLPSLCLPWYPFRKV 

13343 gttgattga 13335 

57 V D * 
dplORF128

4852 atgacagcagttcaacaagttaagttctacttagaagaagccggcgctcactttctaaaagatgttgagtacagtgacaactta 

1 MTAVQQVKFYLEEAGAHFLKDVEYSDNL 

4936 gagcaagcaattatgaaagatattcttaaatggaatggcgctcatagagatgagcacgatatgaaaataacttcatacgaagta

29 EQAIMKDILKWNGAHRDEHDMKITSYEV

5020 ttatag 5025

57 L * 
dplORF129 

25133 atgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgcagccaccaatcttacattgaagaattcagta 

1 MNFLLSNLRSLKFKLMYAATNLTLKNSV 

25217 agaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcagcttgacgaaatctcagctggagcattgcctg 

29 RRKRRTRNGNAFWKNLLSLTKSQLEHCL

25301 tattag 25306 

57 Y * 
dplORF130 

16789 gtgcttgactttattcctttattatcgtataatcataatataaataaaacaagcgtcaaggacgcagaaagaggtcaattatgg 

1 VLDFIPLLSYNHNINKTSVKDAERGQLW 

16705 aaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcattcctg 

29 KQHFISVILQQIGKTVTRTTLSTMKAFL

16621 taa 16619 

57 * 
dplORF131 

43846 atgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaacggaacttatccaa 

1 MLNRLRRNLAGRKMLLVSGTLEQTELIQ 

43930 aagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgctgagaaatggttga 44013

29 KMSSSISKKTSLGSTLTTKATCSLRNG* 
dplORF132 

15304 gtgactggaaggtcatctaatacacatagcctcaagacatttcgttggctttcaggaaaacattcgactagattgtcaatgtat 

1 VTGRSSNTHSLKTFRWLSGKHSTRLSMY 

15220 cccacaaaggcttcaaggttttcgagttcttcgccgtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttga 15137

29 PTKASRFSSSSPWSFTARRKFIRPLAR*
dplORF133 

8061 atgacttcttcattcatgacaagttttcgagttcctgcttgcttgtcaggaatagttttcccggcggctaaaatgtatagatta 

1 MTSSFMTSFRVSACLSGIVFPAAKMYRL

7977 tcgtatttttctttcctgatagcagaacttgaatccatttgtattcccaccatttccgccctatctgcggcgaaataa 7900 

29 SYFSFLIAELESICIPTISALSAAK*
dplORF134 

498 atgacttcaatgtacttaggttccatcaattcatacaagtcattcaaaataatgttcatgcaatcttcgtggaagtcaccgtgg 

1 MTSMYLGSINSYKSFKIMFMQSSWKSPW

414 ttacggaaactgaataagtacaatttcaatgatttagattcaaccaccttttcgtttggaatgtaa 349 
29 LRKLNKYNFNDLDSTIFSFGM*
dplORF135 

780 atgaagcagaacctgaaaatgctgctaatgttgcaatgttctacggagtcaagttcaccattcttgaaattgactcgaaaatct
1 MKQNLKMLLMLQCSTESSSPFLKLTRKS

864 actcaagctctagctcttccttattacaaggaaaaggcgaaatttcacatggaaaatcttacgctgaaatcctag 938

29 TQALALPYYKEKAKFHMENLTLKS* 
dplORF136 

55252 gtgaagaaatcttcaataaccttattcgcttctttgacagatacattcatctgctcagcgattgagttagccccgcggccgtac
1 VKKSSITLFASLTDTFICSAIELAPRPY

55168 ataagacctaaaagaacggacttgacagaatttcttcgaagttttccttccttgttagtcgttccgtcgggatag 55094 
29 IRPKRTDLTEFLRSFPSLLVVPSG* 
dplORF137 

37146 atgcttcgaacttgtttgttagcaccgtcaggaggacaaactagtcgaacccattcacctgcgtctttgataatatctagcgcg 
1 MLRTCLLAPSGGQTSRTHSPASLIISSA 
37062 acagcgcctacagaagaagcaacgtgtttcaacttcctaggcaagccttctgctagttcataccataatgcgtag 36988 
29 TAPTEEATCFNFLGKPSASSYHNA*
dplORF138 

30662 atgactatatcgaagaacaatgtagtcatccggcctatctgtatcttgctcgtcaaattcaactcctggaagcataggagcagg 
1 MTISKNNVVIRPICILLVKFNSWKHRSR 
30578 cgagagctgaaatgtaggaagaatttccttcaatctgtccatcattgtcgttcgtttagtcatgttcactcctag 30504 
29 RELKCRKNFLQSVHHCRSFSHVHS* 
dplORF139 

12092 atgatactaaatcactcaacttgtttgaccctcctgataaattcgttcacgcagacacgcgcacttgagccctttttagatacc 
1 MILNHSTCLTLLINSFTQTRAFEPFLDT 
12008 tttcgcaaacacctagatgcttccctcactaaaaggtcatgggcctcaagttcttcgaaagacatttctacatag 11934 
29 FRKKLDASLTKRSWASSSSKDIST*

dplORF140 

20562 atgttttcgatatttcctgcgcctaagacttcagcttggtcattgttcactaccattaggtattcattagtaagtgctttagca 
1 MFSIFPAPKTSAWSLFTTIRYSLVSALA
20646 aagtttgaaaacttcattttattttccctttatttgtttttctttatactattattatacaataatgattga 20717 
29 KFENFILFSLYLFFFILLLYNND* 
dplORF141 

42922 gtgctaagagttgtagagatatcctctaaaacgctcttggctttattcgatttccattcgaataacttatttagtaggacagta 
1 VLRVVEISSKTLLALFDFHSNNLFSRTV 
42838 agcactccgctgcacgctgtaataatcgtcgtcaagactgctgtgtcgtttagccacattggcatagattga 42767 
29 STPLHAVIIVVKTAVSFSHIGID* 
dplORF142 

31898 gtgactgtcgaagtttctccaaacagttctgtcactttacctaaaagcgtattagggattttcccgttagcgattaggttcatg 
1 VTVEVSPNSSVTLPKSVLGIFPLAIRFM 
31814 acacctgctgctcgaattttaacatggataggttcactaccttttgaaaatcctggaagtgcgatgatttga 31743 
29 TPAARILTWIGSLPFENPGSAMI* 
dplORF143 

7565 atgaagtttgggttgacgcttttaactccagaccgtttaattttttcaaggcttgaaattggataccatataatcttttcatgc 
1 MKFGLTLLTPDRLIFSRLEIGYHIIFSC

7481 ttttggaaatacactaaaattccggcgagaataaatttgcatccatctgcgcgtgatagctggaaccattga 7410 

29 FWKYTKIPARINLHPSARDSWNH*

dplORF144 

36517 gtgcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaatccaacaattacca 
1 VQIKRLTYLDTLNEAHSSRFLMEIQQLP 
36601 ttgaataccgagccgatgacgcagcagcttggacctctactcttcccgctcaagttgaactgtttctaa 36669 
29 LNTEPMTQQLGPLLFPLKLNCF*

dplORF145 

42067 atggaaacagctggagacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataattggcagaaacttgttc 
1 METAGDLTSGKRFYLSKTSNRIIGRNLF

42151 ttcaaagtgggtggaaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatag 42219 
29 FKVGGTITQPMATHSIRKLLTA* 
dplORF146 

51484 atgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttcaaccgtcgaagtgttcgtt 
1 MTNCMIASPFQYGTSRAKQYSSTVEVFV 
51568 ctaagcttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagcttgtag 51636 
29 LSFTSTVKMTLKRNFFMANMSL* 
dplORF147 

55207 atgtatctgtcaaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttc 
1 MYLSKKRIRLLKISSPSSLKWQTISYSF 
55291 aacagcaggcgcaggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatga 55359 
29 NSRRRTWDMFKQLPVEEEGFLI* 
dplORF148 

28636 gtgtttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaaaatgccagcgata
1 VFRFKTIRVGRTPVRFSMSSIAAKMSAI
28552 gggtcactttcagctgggttagcccatttcttagtgactgcatattgttgcttagcatccacgttgtag 28484
29 GSLSAGLVHFLVTAYCCLASML* 
dplORF149 

26474 atgccattgaacttttcgagcataaggattaaccttgccccattgtctcactccagctgtggcggaacggctaatggtagttcg 
1 MPLNFSSIRIKLAPLSHSSCGGMANGSS

26390 agcaagtcgaagggcattgtattcgagattttgatatttatgagcagcaggtttccctag 26331 
29 SKSKGIVFEILIFMSSRFP*
dplORF150

15185 gtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttgatagtcttcgcgaagttcgacgattcgtttgttcat 

1 VVLYSKKEVYSTSCTLIVFAKFDDSFVH 

15101 ttgctttcgctgattgttcatgcaataggctcctcgtatttaatagtttcacaagttgcgtcgacgtag 15033 

29 LLSLIVHAIGSSYLIVSQVAST* 
dplORF151 

28027 atgattatatcaacgcaggggagattgccagctacactcaagcacttccttcaaacgctcttcaataccttggaccaactcttc 

1 MIISTQGRLLATFKHFLQTLFNTLDQLF

28111 tccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaa 28176 

29 SLMLNKQGQTFHGSRVQIICQ*
dplORF152 

42235 atgtgcataaaggacttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttc 

1 MCIKDLSTKRLLLQYFLKDLDRKFQCIF 

42319 aggctctcaataactcatatggaaacgccactctatgtatatacactgacggaagacttgtggtga 42384 

29 RLSITHMEMPFYVYTLTEDLW* 
dplORF153 

22307 atggtggacaaagggctcaccttcccgaactttcgatatcgtcatagcagacggttccattcgtccaggaaaaacagtatcgat 

1 MVDKGLTFSNFRYRHSRRFHSFRKNSID

22391 ggctctttcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaa 22456 

29 GSFIFPLGHDGIQRTKLCHLW* 
dplORF154 

18446 gtgacaataggccttaagaactgcaaaaaaacctggggcgtctgcacgcgcaacctggagctccttaacagtcatccaaggctg 

1 VTIGFKNCKKTWGVCTRNLELLNSHPRL 

18530 aggtttcttacaaacaatcctaattccttcaaaatagctcttgtccgggtcaatagtgcctaa 18592 

29 RFLTNNPNSFKIALVRVNSA* 
dplORF155 

13512 atgaatacgaccccgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttcattcaactcacgccag 

1 MNTTLSNLQWDMVQNLISFFNVSFNSRQ 

13596 ttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttga 13658 

29 LKLKQFSGIWEPMILVLMQI* 
dplORF156 

18777 atgctagtatctccatttctgttggncttgctttttagctctgttcagttcagctgcttctcgcgatgcaatagtttcgagaat 

1 MLVS PFLLVLLFSSVQFSCFSRCNSFEN 

18861 atgcctgttcataggctcacaatattccgccaaagatttgccagttatggtggcgtcaattaa 18923 

29 MPVHRLTI FRQRFASYGGVN * 
dplORF157 

13281 gtgcttgctggacttgagaagaaattggtatcattttcgagccaatccataaggttctcgataccgtcacgattgattgtttct 

1 VLAGLEKKLVSFSSQSIRFSIPSRLIVS 

13197 gttactgctttcttgaagcgttttttaaagtctgtcatattagacccctttcattttctataa 13135 

29 VTAFLKRF LKSVI LDP FHFL * 
dplORF158 

40727 gtgaacgccgttattagggtcaaacgaagcccaaacggacattgtctttgtcccgtcactattgtgaggaacagtcacttctcc 

1 VNAVIRVKRSPNGHCLCPVTIVRNSHFS 

40643 acttgcgagcgttacctcctcgccggacgtgtcgtagtctgggtgactgctatgaacacttga 40581 

29 TCERYLFAGRVVVWVTAMNT* 
dplORF159 

30371 atgatttggtctgcgcttacccaagcagcttctcctttgagtttctgtcgagcattccctgtacggtctgtccaaatagcatgc 

1 MIWSALTQAASPLSFCRAFPVRSVQIAC 

30287 gtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttga 30225 

29 VFAYSSILVAATSQTVMTAT* 
dplORF160 

41324 atgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaa 

1 MGYRHARKTIERPRRIYQCYRI LWTVYQ 

41408 tttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaa 41467 

29 FLRSTYSSKSCNYPSSSKC* 
dplORF161 

52175 atgcaaaaaggtttaaatgcttatctcgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttca 

1 MQKG LNAY LDMTLKALHSRLFQNVWQRS 

52259 aatcaaaccaaggggccaagttttcaacttaccttacaagactcctcaagaatagaatag 52318 

29 NQTKGPSFQLTLQDSSRIE* 
dplORF162 

13020 atgacagaagttgcggtaaatagcccgcaaaaggtgagagtagttatggtcgggaatattgaatttctcgaatatttaaaaagg 

1 MTEVAVNS PQKVRVVMVGNI EFLEYLKR 

13104 aagtacggaacagaaacttccatcagttatattatagaaaatgaaaggggtctaatatga 13163 

29 KYGTETSISYIIENERGLI* 
dplORF163 

40224 gtgaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatcttta 

1 VTEFLCSPQGMKLCTLRKGSFTSITGSL 

40308 cccaatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatga 40367 

29 PNPFKSADLERNNTRLIQT* 
dplORF164 

6696 atgtacccccggagaacttcgtgcctaaatgttccagcttcgcccattgcaattaggttagaatctgcgtcatccacaatagac 

1 MYSWRTSCLNVPASPIAIRLESALSIID 

6612 tcaccgattctttcgaaatacatttttcgaatacatccaccaaccccgctgggcttataa 6553 

29 SPILSKYIFRIHPPTPLGL* 
dplORF165 

50504 atgagtgaaagctggtcaatccccaccacagatggtctatatttagatatcatgctatctaaaattgcaggggtaaggttcttt 
1 MSESWSIPTTDGLYLDIMLSKIAGVRFF 
50420 cctccaatcataaagggcgtgactaccacaagggaattttcagcctcagtcattgcttga 50361 
29 PPIIKGVTTTREFSASVIA* 
dplORF166 

23519 gtggtcatgctctttaatgactctatcttctcccgtttggctcgctttactgtcccagctgtaagcatagtattcatcaatgtc 
1 VVMLFNDSI FSRLARFTVPAVS IVFINV 

23435 gtgcgtgttgctagggtcgagtgtaaatctattctcagccaagagttcagcgtgaaatga 23376 
29 VRVARVECKSILSQEFSVK* 
dplORF167 

1008 atgcttattcggttggagcttcttacgtcgtatatggtgctcacgcagacgatgcggctggaggtgcttaccctgattgcactc 
1 MLIRLELLTSYMVLTQTMRLEVLTLIAL 
1092 ctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaa 1148 

29 LSSIIQCQMQWNMELEAR* 
dplORF168 

54345 atgagactttttccaggttatattcttcacattgttcagttcctggagtcaagtattgttcttgaaattcatagagttcgaaag 
1 MRLFPGYILHIVQFLESSIVLEIHRVRK 
54261 tttgcaaagggtcataggccgcatacatataggcaacatcaggaggaattaaactaa 54205 
29 FAKGHRPHTYRQHQEELN* 
dplORF169 

45954 atgaacacagcatcgcgaagagtttcaatgttagtgataaggaagaattcgtcgtggccaccaagcaagtcttctgcccgttta 
1 MNTASRRVSMLVI RKNSSWPPSKSSARL 

45870 gaaactccgtcaatcactaatttcccatctttagtgactcgacttcctaaaatatga 45814 
29 ETPSITNFPSLVTRLPKI* 
dplORF170 

27600 atgatgattgttcttgtgctcctgccgtttgttgagcagcagcaagttgcttaccaaaagagccgatttcacgaggttcgggaa 
1 MMIVLVLLPFVEQQQVAYQKSRFHEVRE 
27516 caccaccaccgacacgacctggatttcctaaatttccagtcccggctggcgacttag 27460 
29 HHHRHDLDFLNFQSRLAT* 
dplORF171 

47678 atgtcattttctttcatgtactcttttagagcatcacgaagacttttgacttgtttctccatgtcgcctttggtagcatttaat 
1 MSFSFMYSFRASRRLLTCFSMS PLVAFN 

47594 tcaccggcttcttcaattgcagcgatgaactgtttttcatcttcaaatttcatttaa 47538 
29 SPASSIAAMNCFSSSNFI* 



dplORF172 

10462 atgtttcgaacattttctaccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcg 
1 MFRTFSTPLLEAASISIGEPSPLFTSFA 
10378 aaaattcgagcagtagtggttttaccagttccagcgccaccacagaatagatag 10325 
29 KIRAVVVLPVPAPPQNR* 
dplORF173 

32160 atgacattagacatttccttcgtctgtacgaaaggtttcagcttgagtcacttcaccgtacattgcactgaagattgtcataag 
1 MTLDISFVCTKGFSLSHFTVHCTEDCHK 
32076 ttgctcatctgtcatatactcgccgacttcagcgtaagtaggctctaccattga 32023 
29 LLICHILADFSVSRLYH* 
dplORF174 

29766 atgtcccatcagcccttttcattaagattgtcgaaccagcgttcgacttttcatcagtttcaagctgttcttgcttatattggt 
1 MSHQPFSLRLSNQRSTFHQFQAVLAYIG 
29682 cataatagaattgcgccatttgtttccagtagtctgcgtcaccttttagactga 29629 
29 HNRIAPFVSSSLRHLLD* 
dplORF175 

15648 atgcgcgtgatgtcatggcagataggcgaggataaagagtgtcgaatagaacgccgcagagcttacgagagcgccaaatacaag 
1 MRVMSWQIGEDKECRIERRRAYESAKYK 
15564 ggcgacggtactacggtggtcctcttgcttacctgtaaccaaataaaccattga 15511 
29 GDGTTVVLLLTCNQINH* 
dplORF176 

43031 gtgataaagacggtaacgttgaatttttctagttccgtcttgaatgacgtcattttggtgattgattgctactgtcgtttggtc 
1 VIKT VTLNFSSSVLNDVILVIDCYCRLV 

42947 aatcccgtcgacctgctgtttaagagtgctaagagttgtagagatatcctctaa 42894 
29 NPVDLLFKSAKSCRDIL* 
dplORF177 

19937 atgaacctaaacagttcgagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcata 
1 MNLNSSRLLKLLGKKQVEYFGGNVNLVI 
19853 ttctcgcgactaattttaggtgcttttgtattaatcagcgtgatatgcgcttga 19800 
29 FSRLILGAFVLISVICA* 
dplORF178 

11924 atgacaactgtcgaccaatttaaaagacagtcgaggaaaagctcaggctcaatttttccttcatcagtctccttaaatttgagc 

1 MTTVDQFKRQLRKSLGSIFPSSVSLNLS 

11840 caattagtaacctttagcgaattgctagcacttgcctcccatattaagtcataa 11787 

29 QLVTFSELLALASHIKS* 

dplORF179 

56058 atgggtagggttattccttacctcgttgatttgctttatgcaaaacctaccacaatcgcttgtcgtggcttcaggagttgcatt 
1 MGRVI PYLVDLLYAKPTTIACRGFRSCI 

56142 ttggataagtcaaaaagcaagtgtctttatattcgacaagctctcgaataa 56192 
29 LDKSKSKCLYIRQALE* 
dplORF180 

41176 atgttcgacatgatttggaggaagttgttccctgttaaaatctgtcgaaccgctgaagtcgtgtctactaaagaaatgcccgaa 
1 MFDMIWRKLFPVKI CRTAEVVSTKEMPE 

41092 aaagtaggacgtactgaatcggggatgttgaacctccatccgtttgaatag 41042 
29 KVGRTESGMLNLHPFE* 
dplORF181 

13126 atggaagtttctgttccgtacttcctttttaaatattcgagaaattcaatattcccgaccataactactctcaccttttgcggg 
1 MEVSVPYFLFKYSRNSIFPTITTLTFCG 
13042 ctatttaccgcaacttctgtcataggctgtcctcctttgcttatactgtaa 12992 
29 LFTATSVIGCPPLLIL* 
dplORF182 

45369 gtgcttgcccatgtttcaataaatagggttcgacctcgcctagctttcgaacgtgctataacgatttcaatcatagcgaagaaa 
1 VLAHVSINRVRPRLAFERAITISIIAKK 
45285 ggtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttga 45235 
29 GEKLQSIPLRCQYLLP* 
dplORF183 

13896 gtgattccagcttttggtttttcttcagcctcttcaactttttcttccttaggcgcaggtttcttacgagttgaactcttaggt 
1 VIPAFGFSSASSTFSSLGAGFLRVELLG 
13812 ttttcttcaactacttcttcaacctcagcctcttgttcaactggaccttga 13762 
29 FSSTTSSTSASCSTGP* 
dplORF184 

53330 gtgaacttgccgtcaaccacgtcaaacatttggtcttcgtcgaggtctaaaattagagttccaagaagttcgctcttttctgga 
1 VNLPSTTSNIWSSSRSKIRVPRSSLFSG 
53246 aaatcttcaagagtagcactgtcttccggacgctctggaaggaattcataa 53196 
29 KSSRVALSSGRSGRNS* 



dplORF185 

22522 atgaaattcgagatgttcgaaatgaaaatctacttattattagacactttagaaatggcgaagaaattgtcaactacttctata 
1 MKFEMFEMKIYLLLDTLEMAKKLSTTSI 

22606 catttggaggaaaagatgagtcgagccaagaccttatacagggggtaa 22653 
29 YLEEKMSRVKT LYRG* 

dplORF186 

21272 atgctcgaaaaactcaaccggttcgaaaacctcaatccttcgaaaagtcgaaccattcgaaaagttcaaaagttcgaaaaactc 
1 MLEKLNRFENLNPSKSRTIRKVQKFEKL 

21356 aaccattcgagagtaggaattaaggacataccagttcaacctttttag 21403 
29 NHSRVGIKDIPVQPF* 
dplORF187 

34415 atggtcttgttcaatctcttcctactatcattcaagcagctgttcaaattatcactgctttattcaatggtcttgttcaggcac 
1 MVLFNLFLLSFKQLFKLSLLYSMVLFRH 
34499 ttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataa 34546 
29 FLRLFKQVFKFCQLS* 
dplORF188 

35609 atgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagtgacaaccctaaccaacctcagtcacaatcta 
1 MFVKQPVRLEWTCS IQEVTTLTNLSHNL 

35693 aaaacaatcaaggcgagcaaaccgttgccaacattggaacaatcgtag 35740 
29 KTIKASKPLSTLEQS* 
dplORF189 

42587 atgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaacttcgagctgacg 
1 MQTQYQPSLKLFMTQTCMLRTVENFELT 
42671 agcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctag 42718 
29 SKNFAKLVT QSKMKF* 

dplORF190 

39786 atgtatccactcaaagttgttcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcag 
1 MYSLKVVQCGSI ILKSNLVI SLLLLVKQ 

39870 aggaagacctcaaatatcgaattgactcaaaagccgatcaaaagctaa 39917 
29 RKTLNI ELTQKPIKS* 

dplORF191 

40996 atgtccattgttccggaacttgatttaggtaagtaccttgctaagtccagtgacggcgtaaaggatacgctagtagtatggttc 
1 MSIVPELDLGKYLAKSSDGVKDTLVVWF 
40912 ttacctaaatctatccagtcgctaccgaaaactcggtaccaaacttga 40865 
29 LPKS IQSLPKTRYQT* 

dplORF192 

2920 atggtcgacgtcgaatgttttttcgagatgaagtttagggtcttctcgataccctacggtatgttcagcgagtgctccaacaaa 
1 MVDVECFFEMKFRVFSI PYGMFSECFNK 

2836 acggaatggagtatcttgcaacccgtcacgttctgcgtcctcgcctaa 2789 

29 TEWS ILQPVTFCVLA* 

dplORF193 

42456 atgatttcagctcaaattaaacacgaaatgagacattgtctaaatttaaccaagaattatctacattcgatttcaccacaagtc 
1 MISAQIKYEMRHCLNLTKNYLHSISPQV 
42372 ttccgtcagtgtatatacatagaatggcatttccatatgagttattga 42325 
29 FRQCIYIEWHFHMSY* 
dplORF194 

40284 atgaacccttgcgtaaggtacataacttcattccctgcggagaacatagaaattcggtcacttgataccttaatggtagagcta 
1 MNPCVRYITSFPAENIEIRSLDTLMVEL 
40200 ccgtcgttcttaccgataattagaccttcattagaagagctcatgtaa 40153 
29 PSFLPIIRPSLEELM* 

dplORF195 

42584 acgctcacaaccgttgttttgacaagtttcttttcagctccttgtccaatagtgaactctgccacaatttggcgcgattttgta 
1 MFXIV VLTSFFSAPCPIVNSATIWRDFV 
42500 aggttcaacatagttctcacctcctttctaaaaaatattataacatga 42453 
29 RFNIVLTSFLKNI IT* 

dplORF196 

11273 atggtagatttaacaagtccctgtccaatcatgtcactcctccttgctcatcaaaagaagtttggtttcaattatcggtttagc 
1 MVDLTSPCPIMSLLLAHQKKFGFNYRFS 

11189 attaggctcccatttaacaactccagcaagttcatccatttcttccag 11142 
29 IRLPPNKSSKFIHFF* 

dplORF197 

7484 atgaaaagattatatggtatccaattccaagccttgaaaaaattaaacggtctggagttaaaagcgtcaacccaaacttcatcg 

1 MK RLYGIQFQALKKLNGLELKASTQTSS 

7568 acgcagggtatgaagttccttacaagaagcgtcgaactagattga 7612 

29 MQGMKFLTRSVELD* 
dplORF198 



24119 acgccgctcaacaaattgacgtccagttctattcaatgcctcagttcacctatacagttgaccctagaaacccttccagcttgc 
1 MPLNKLTSSFIQCLSSPIQLTLETLPAC 
24203 tttctgttgacattgtttatcaggacgagcgtacaaaaggaatga 24247 
29 FLLTLFIRTSVQKE* 
dplORF199 

15742 gtggctcctgaattaggctgtacttttcctcccaactgcttagcaactgccttctcttgtttagcactagctctgcgcgtggga 
1 VAPELGCTFPPWCLATAFSCLALALRVG 
15658 attggtttgtatgcgcgtgatgtcatggcagataggcgaggataa 15614 
29 IGLYARDVMADRRG* 
dplORF200 

47843 atgacaggcttgtattcgataagccctgaaagtttttcacacatttcttccgtctcggcttcgtcaactaatttttcgataatt 
1 MTGLYSISPESFSHISSVSASSTNFSII 
47759 tctttcaagcgttcttcgtccatagttgagcgctctgtcgtgtag 47715 
29 SFKRSSSIVERSVV* 
dplORF201 

38569 atgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttggacctataccgattcaactaccga 
1 MGFTSSFFNQRSISLDSNYLDLYRFNYR 

38653 aacgggctatcaaaaaacctacattccaaaagacgggaatga 38694 
29 NGLSKNLHSKRRE* 
dplORF202 

44483 gtggggcgtttattttttataaaaattttttacaaaatgcttgacaacattcactcattatcgtataatacaattataaaaata 
1 VGRLFFIKI F Y KMLDNIHSLSYNTIIKI 

44567 aataaagccgaaaggcgaggaggacattatgtcaaaaattaa 44608 
29 NKAERRGGHYVKN* 
dplORF203 

22781 ataattaggattggccgggttacaagagaaccacattttcgaacctgttacggaacagcgccccgtcgcttggttgacaaacga 
1 VIRIGRVTREPHFRTCYGTAPCRLVDKR 

22697 ttcaggcatcagtgccacctcatcacagaagatacctgctaa 22656 
29 FRHQCHLITEDTC* 
dplORF204 

1471 atgaccacggttcgagtcaagggatggttgttgacttctatcacgtcaagaaaatcgcaggtacactcattgacagacttgacc 
1 MT TVRVKGWLLTFITSRKSQVHSLTDLT 
1555 acgctgttcttcttcaagggaatgaaccaatcgctttag 1593 

29 TLFFFKGMNQSL* 
dplORF205 

8524 gtgacactgatgaatggttctcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaacttagaa 
1 VTLMNGSQFGMLLVTQI SSTTKELPNLE 

8608 ttcaggaaaagcaacctgctatcaagttcaatttcgtag 8646 

29 FRKSNLLSSSIS* 
dplORF206 

19855 atgaccaagttcacgttcccaccaaaatattcgacctgcttctttcccaacagcttgagaagtctcgaactgtttaggttcatc 
1 MTKFTFPPKYSTCFFPNSLRSLELFRFI 
19939 aaattgttcaacttgagcaagtgcgatattattctttag 19977 
29 KLPNLSKCDI IL* 

dplORF207 

27502 gtgtcggtggtggtgttcccgaacctcgtgaaatcggctcttttggtaagcaacttgctgccgctcaacaaacggcaggagcac 
1 VSVVVFPNLVKSALLVSNLLLLNKRQEH 
27586 aagaacaatcatcattctttaaataataggaggaactaa 27624 

29 KNNHHSLNNRRN* 

dplORF208 

47279 atgttcggtatgaagcaaaagacttcgctgaagaaaataacattcacttcccgtttgttcttcctgaacctagaacagaccttg 

1 MFGMKQKTSLKKITFTSRLFFLNLEQTL 

47363 accatcgtggttctcgattctgggatgacgaaggcgtga 47401 

29 TIVVLDSGMTKA* 

dplORF209 

29784 atgttaagaatcaagttcgtagagccattgaaaccgctcctactaaaatcaaggtacttcgaaactcttgggtcagtgatggat 
1 MLRIKFVEPLKPLLLKSRYFETLGSVMD 
29868 atggaggaaagaaaaaggataaagcgaatgaagtcgtag 29906 
29 MEERKRIKRMKS * 

dplORF210 

53077 atgtttcaacttttcccgtatcatggttgtaaagttgaagaaatagtttttcaatacgagggaatccgttttggcataatggac 
1 MFQLFPYHGCKVEEIVFQYEGIRFGIMD 
52993 aattatcaggatggactgtttccccgtcttcgccaatag 52955 
29 NYQDGLFPRLRQ* 



dplORF211 

20959 gtgctcgacttttatgtcgcccctaatttttgtttttacttacggactatgggatttgtaggtattttcagggcgcttttttat 
1 VLDFYVAPNFCFYLRTMGFVG IFRALFY 

20875 ttacttattaagtccttttctatattagattgtttataa 20837 
29 LLIKSFSILDCL* 
dplORF212 

52983 atggactgtttccccgtcttcgccaatagcattgcaattgatatagcgtcgacgaccgtcaacgtccgcttcgtggactacgaa 
1 MDCFPVFANSIAIDIASTTVNVCFVDYE 
52899 ataatccatgtcttcgccttccgggtcatcatacaatag 52861 
29 IIHVFAFRVIIQ* 
dplORF213 

30291 atgcgtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttgaaacttgtttcgataccg 
1 MRLCVFFHLSSSDFADCYDSDLKLVSIP 
30207 ttcacagttactaacaaattcttcaggcttccatactaa 30169 
29 FTVTNKFFRLPY* 
dplORF214 

24273 atgatgccaaagttgtttttcagtgctcattccttttgtacgctcgtcctgataaacaatgtcaacagaaagcaagctggaagg 
1 MMPKLFFSAHSFCTLVLINNVNRKQAGR 
24189 gtttctagggtcaactgtataggtgaactgaggcattga 24151 
29 VSRVNCIGELRH* 
dplORF215 

35822 atgttaccaaaccctgatagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaact 
1 MLPNPDRVSLLLLYNPLDSLSTSSLFRT 
35738 acgattgttccaatgttgacaacggtttgctcgccttga 35700 
29 TIVPMLTTVCSP* 
dplORF216 

32849 atggcctcggagctcgcggccacatctcctccagatacggcagccaggtcaagtacccctggcatagcgtccatgatttcattt 
1 MASELAATSPPDTAARSSTPG IASMISF 

32765 acctggaaaccggctgaagctagattttccataccttga 32727 
29 TWKPAEARFSIP* 
dplORF217 

23443 atgaatactatgcttacagctgggacagtaaagcgagccaaacgggagaagatagagtcattaaagagcatgaccactgcatgg 
1 MNTMLTAGTVKRAKREKI ESLKSMTTAW 

23527 ataggaacagatatgcccgtctcactgacgctctaa 23562 
29 IGTDMPVSLTL* 
dplORF218 

22029 atggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggccaaaatgggcgacc 
1 MECFRKRFDIDYKLSARKLHCSGPKWAT 
22113 aggaaattgaaggcgaggttaaagataacttcgtag 22148 
29 RKLKARLKITS* 
dplORF219 

51388 atgattttatgctcgactttttcagttctcccatttcttcgaaacgcttcagggctgacgccttgcctaactacttcgctagat 
1 MILCSTFSV LPFLRNASGLTPCLTTSLD 

51304 gttccaaaattccttttcagccactggtttccatag 51269 
29 VPKFLFSHWFP* 
dplORF220 

6334 gtgaagttttcttcggtgacggttgatacaatttccctcaagagcaagctgttaaggtggcaagtgaattctttcttcgaaact 
1 VKFSSVTVDTISFKSKLLRWQVNSFFET 
6250 ttcttgccagcagatgcgtacatgatgtcttcataa 6215 

29 FLPADAYMMSS* 
dplORF221 

43507 atgactgctcaagttctatgtactatgctctccgctcagccggagcttcaagtgctggatgggcagtcaatactgagtacatgc 
1 MTAQVLCTMLSAQPELQVLDGQSILSTC 
43591 acgcatggcttattgaaaacggttatgaactaa 43623 
29 THGLLKTVMN* 
dplORF222 

13212 gtgacggtatcgagaaccttatggattggctcgaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaa 

1 VTVSRTLWIGSKMI PI SSQVQQALDTME 

13296 gctatgaaggtggacttgtcgagcactcattaa 13328 

29 AMKVDLSSTH* 

dplORF223 

14055 atgtggtggtacctgctggatatgttcgagatgtctactacttctacagtgaagtcgctgacgtttactacaagaaagatgtcg 

1 MWWYLLDMFEMSTTSTVKSLTFTTRKMS 

14139 acgagcctgacgatgacagcgacattcttgtag 14171 

29 TSLTMTATFL* 
dplORF224 

13621 atgccagaaaattgcttgagcttcaactggcgtgagttgaatgaaacgttgaagaaggaaattagattttgcaccatgtcccat 
1 MPENCLSFNWRELNETLKKEIRFCTMSH 
13537 tgtaagttgctcagggtcgtattcatatgctaa 13505 
29 CKLLRVVFIC* 
dplORF225 

32991 gtgagcaacgggtgcgacgtatttcatcgcctctgccacgtcgctagtttctgcgttcgtatcagctgctgctcgagcaaatac 
1 VSNGCDVFHRLCHVASFCVRISCCSSKY 
32907 gtcagccacgtgacccgcctggtttgcctctaa 32875 
29 VSHVTRLVCL* 
dplORF226 

25191 gtggctgcgtacattagtttgaacttcagtgagcgcaagttgcttagcagaaagttcatcgctaggaattggatagtggtgttc 
1 VAAYISLNFSERKLLSRKFIARNWIVVF 
25107 gatagtcattgtcgtaagtgtttgataacttga 25075 
29 DSHCRKCLIT* 
dplORF227 

23115 atgactcaattagatggtagcgcttatgacgtttcgagaatccataaaggccgaaggttgttgcattatagataccaaagtcgc 
1 MTQLDGSAYDVSRIHKGRRLLHYRYQSR 
23031 ctgctacgaataaacggtcgaattctatattga 22999 
29 LLRINGRILY* 
dplORF228 

10450 atgttcgaaacattattgaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgc 
1 MFETLLKI LDTSLWTASSKFTSLTRFIC 

10534 tttcaaccggagcatttaatgcgctgttga 10563 
29 FQPEHLMRC* 
dplORF229 

27634 atgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttggctgctc 
1 MCELRKLILIKPLEALSQFLTTTLLWLL 
27718 aaattccagctaccgcagcaactcaagtag 27747 
29 KFQLPQQLK* 
dplORF230 

50723 gtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatatatgtggg 
1 VTKNPAYLNYLS LKTDMAKTEKSSNICG 

50807 acgttgaaactggaacctatactcttatag 50836 
29 TLKLEPILL* 
dplORF231 

31071 atgcgcgtgtcattgcgtttcacatcttcagttccctccgaggtcacggcttcgagttctgctgtttctgccgtatctacgaca 
1 MRVSLRFTSSVPSEVTASSSAVSAVSTT 
30987 aagttagctccgccgacttttggcaactga 30958 
29 KLAPPTFGN* 
dplORF232 

29385 atgtcaattccattagctcttgctaattcaacgagctcaggaacggttttagccgcatactcttcgcgcatttgttcaacttcg 
1 MSIPLALANSTSSGTVLAAYSSRICSTS 
29301 tcaatttcttcaactgattcaattgtttga 29272 
29 SISSTDSIV* 
dplORF233 

52892 atgtcttcgccttccgggtcatcatacaatagagtgacaattgcgctgtcaccgtggtcagcgagtgtgaaaaactcgttatta 
1 MSSPSGSSYNRVTIALSPWSASVKNSLL 
52808 gaccctgagctaaatgttcctgatttttga 52779 
29 DPELNVPDF* 
dplORF234 

36253 atgcttacgagtacagcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccag 
1 MLTSTATQLFERFISFNPLWEAIAYLTQ 
36337 gaagacctactcgacaatttagagtag 36363 
29 EDLLDNLE* 
dplORF235 

32768 atgaaatcatggacgctatgccaggggtacttgacctggctgccgtatctggaggagatgtggccgcgagctccgaggccatgg 
1 MKSWTLCQGYLTWLPYLEEMWPRAPRPW 
32852 ctagttcacttcgagcctttggattag 32878 
29 LVHFEPLD* 
dplORF236 

37528 atgttcgtcgcttttagatttagcaatatatcgaggcttcatgtggcgtgtagtaaaccacgaaacatcaatgagatattcact 
1 MFVAFRFSNI SRLHVACSKPRNINEIFT 

37444 tccattgttgatagaagcaaacgttaa 37418 
29 SIVDRSKR* 



dplORF237 

1678 gtgagagtccaggtaaggaatcttgacatattctcagccgtagttctaaatccaaatagaactcgcttggtgtcaacrgcattt 
1 VRVQVRNLDIFSAVVLNPNRTRLVSTAF 
1594 gctaaagcgattggttcattcccttga 1568 

29 AKAIGSFP* 
dplORF238 

1301 atgcctttttgcggtcgatacaagttgcgcaagttccacaactttcagcgtcactttcataacatgaacgagtcaagaaataag 
1 MPFCGRYKLRKFHNFQRHFHNMNESRNK 
1217 gaacatctaaatcaattccccatttaa 1191 

29 EHLNQFPI* 

dplORF239 

26521 atggtgaagtattccctatcgaagaatgtcctttcgaccatcccaatggaatgtgctaccaaactgtatggtacgaaaactcac 
1 MVKYFLSKNVLSTILMECATKLYGTKTH 
26605 tcgaagaaatcgctgatgagttga 26628 
29 SKKSLMS* 
dplORF240 

41893 atgtttggaataagcgtgaaacagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaacccgaggtgaatggg 
1 MPGISVKQSLHGEVTNTRTTLRELEVNG 
41977 gactatttcaaaatttctggttag 42000 
29 DYFKISG* 
dplORF241 

47020 gtgtctttccttaatatggagatagttctcattctatttaagcaggatatcgaaaaggttaccaattttagatttcataggctt 
1 VSFLNMEIVFI LFKQDIEKVTNFRFHRL 

46936 accatctacgatataatctgctaa 46913 
29 TIYDIIC* 
dplORF242 

41338 gtgtctgtaacccatgctcttacggtagcggagccattaaagttcatcatacccaatttgccgccgttttcgttgatagcttgg 
1 VSVTHALTVAEPLKFI I PNLPPFSLIAW 

41254 tttttacctacgagctcagcgtga 41231 
29 FLPTSSA* 
dplORF243 

51306 atgttccaaaattccttttcagccactggtttccatagaaccctccatcgtttcgacctaatacattcgagacgaatccagtta 
1 MFQMSFSATGFHRTLHRFDLI KSRRIQL 

51222 gtcctgaagtgtagccgcaagtga 51199 
29 VLKCSRK* 
dplORF244 

27083 gtgaggtacaaaatgttgaccgtcgccgtcaatgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcac 
1 VRYKMLTVAVNENFSIEFFRSFRNNFLH 
26999 ctgtttgatagttggttcatctag 26976 
29 LFDSWFI * 

dplORF245 

6278 gtggcaagtgaattctttcttcgaaactttcttgccagcagatgcgtacatgatgtcttcataactgctagtagaagtcttaat 
1 VASEFFLRNFLASRCVHDVF I TASRSFN 

6194 tcgaagtcggtctttcaagaataa 6171 

29 SKSVFQE* 
dplORF246 

2831 atggagtatcttgcaacccgtcacgttctgcgtcctcgcctaatagaccaaaaagtctttgaacggctgcctcagtattgtcca 
1 MEYLATRHVLRPRLIDQKVFERLPQYCP 
2747 aggttacaatttcatccggcttaa 2724 

29 RLQFHPA* 
dplORF247 

29641 gtgacgcagactactggaaacaaatggcgcaattctattatgaccaatataagcaagaacagcttgaaactgatgaaaagtcga 
1 VTQTTGNKWRNS1MTNISKNSLKLMKSR 
29725 acgctggttcgacaatcttaa 29745 
29 TLVRQS* 
dplORF248 

53560 gtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacaggaagcctgcagttgaggttactt 
1 VQSLVLARRTMLSYLLNGKTGSLQLRLL 
53644 acatttcaggaaacgctctaa 53664 
29 TFQETL* 
dplORF249 

2012 gtggatgcgactatcattgcaactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaag 
1 VDATIIATGVTQPLPGTVLLSRNISQAK 
2096 aagctgctagtcgaatcttga 2116 

29 KLLVES* 

dplORF250 

23837 atgggcaaacatggaagattgacgaagactcagtcgaccataaacctactcgagaaattcgaaactatattcgacaacttatca 
1 MGKHGRLTKTQSTINLLEKFETIFDNLS 
23921 aaaagcaatcacgctttatga 23941 
29 KSNHAL* 
dplORF251 

39205 atggaaataattagtcttaccgtctgcgcctggcctcccgggtatcccttgagctccgtcattccccttccatttcgtccatgt 
1 MEIISLTVCAWLPGYPLSSVIPLPFRPC 
39121 ataggctgcagggtcttttga 39101 

29 IGCRVF* 

dplORF252 

54771 gtgttgtataggtcgaaaccaattttgcatattttctatatttcaaaagtgcttttgagatatcgttatcaaaatgctcgacaa 

1 VLYRSKLILHIFYISKVLLRYRYQNARQ 

54687 tactttcgcctgttcctctag 54667 

29 YFRLFL* 

dplORF253 

56255 atggttgcgtctataatagaaccgatgttgctagacaaagcatttgcaatcttcgagcctaatttattcgagagcttgtcgaat 
1 MVASIIEPMLLDKAFAIFESNLFESLSN 

56171 ataaagacacttgctttttga 56151 
29 IKTLAF* 
dplORF254 

48479 atgaacctttcgcttaggttcaatctttttcgaacattttcatatttaacaaaactttcagctaaaaatcgacaaagttcaatg 
1 MNLSLRFNLFRTFSYLTKLSAKNRQSSM 
48395 ttcgactcaatgtttaaataa 48375 
29 FDSMFK* 
dplORF255 

9572 atgctttggtcttctcgacgaatgactctactacattccctgcagggtttcgagcagtacgggtcaatgatgcaccgttttcgt 
1 MLWSSRRMTLLHSLQGFEQYGSMMHRFR 

9488 caaggtagtcaccttttctaa 9468 

29 QGSHLF* 
dplORF256 

15289 atgaccttccagtcactaatgcggccgctgaaattggataccactatacatgggttcaccaacttcgagacaaagcagttgaaa 
1 MTFQSLMRPLKLDTTIHGFTNFETKQLK 
15373 cacttgaagaaattttag 15390 
29 H L K K F * 

dplORF257 

28216 gtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgagtctatgcgacttggtgaaaaagaccgtcaaa 
1 VNVLDLANKLLRWHSSVSLCDLVKKTVK 
28300 acttgcaaatgctattga 28317 
29 TCKCY* 

dplORF258 

44023 atggaaattggtattggttcgaccgtgacggatacacggctacgtcatggaaacggattggcgagtcatggtactacttcaatc 
1 MEIGIGSTVTDTWLRHGNGLASHGTTSI 
44107 gcgatggttcaatggtaa 44124 
29 A M V Q W * 

dplORF259 

4298 atgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaaga 
1 MTRLRSIKTSGWKEYSKLFETVLIQTLR 
4382 ctcacgcatttgggatga 4399 

29 L T H L G * 

dplORF260 

24746 gtgaccctacttcctcaatcggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccag 
1 VTLLPQSAVLEASKLKSLPFQETSTSFQ 
24830 cggctgaatattatttag 24847 
29 RLNII* 

dplORF261 

288 atgaattcacttccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatgg 
1 MNSLPFALKQDSLTSRMFSLVTFQTKRW 
372 ttgaatctaaatcattga 389 

29 LNLNH* 

dplORF262 

9408 atgcctattcaactccaggcggaaagatgtggaagcatgcttgtgcagttcgacttaaatttagaaaaggtgactaccttgacg 
1 MPIQLQAERCGSMLVQFDLNLEKVTTLT 
9492 aaaacggtgcatcattga 9509 

29 K T V H H * 



dplORF263 

27052 atgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcacctgtttgatagttggttcatctagacctttt 

1 MKILASSSFEVFEIISFTCLIVGSSRPF 

26968 aacaagtcttctaattga 26951 

29 N K S S N * 
dplORF264 

6139 gtgaatagtacaaggcggtctaatacgctcaggatttctgctgtagggatagccgcatcatcttcaaactcaattgagtcaagc 

1 VNSTRRSNTLRISAVGIAASSSNSIESS 

6055 tgtgaaacgtcttcataa 6038 

29 C E T S S * 
dplORF265 

4801 gtgaataaagtcaagcgtttttgtataaaaagttcatttttttttaaaaaaaataagagcgaaaagctcttatctaaaatagtc 

1 VNKVKRFCIKSSFFFKKNKSEKLLSKIV 

4717 gacgttgacgatttttaa 4700 

29 D V D D F * 
dplORF266 

50220 atgcccgttcttccaagcagttgcaagcattttatcaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaa 

1 MPVLPSSCKHFINSPRLTLSRSSHYDNQ 

50136 atcctcaccaggaagtaa 50119 

29 ILTRK* 
dplORF267 

47367 atggtcaaggtctgttctaggttcaggaagaacaaacgggaagtgaatgttattttcttcagcgaagtcttttgcttcatacca 

1 MVKVCSRFRKNKREVNVIFFSEVFCFIP 

47283 aacattaatcgtagatag 47266 

29 N I N R R * 
dplORF268 
12621 atgtcaatttcggtcttgtgcttgacaatggattcaactactgatgcgtcaacctttttcaatcgcgacagcttgtccaattca 
1 MSISVLCLTMDSTTDASTFFNRDSLSNS 
12537 ttgtcaattctagagtaa 12520 

29 L S I L E * 

dplORF269 

53834 gtgaatagtatcgagtccatcagctcctacgtcaatagaacctattccgtcttcaatcattttgtctacatactgctcgagttt 
1 VNSIESISFYVNRTYSVFNHFVYILLEF 
53750 tgcttcctcagtgattaa 53733 
29 CFLSD* 

dplORF270 

50792 atgatttttcggtcttcgccatatcggttttcaacgacagatagttcaagtatgccggatttttcgtcacgcttcatagcgata 
1 MIFRSSPYRFLTTDSSSMPDFSSRFIAI 
50708 actctgctagcattttga 50691 

29 TLLAF* 

dplORF271 

19739 atgaggctgctttgctctatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaattcatacctcaaag 
1 MRLLCFIFVTVLTDFLLANLPTRIHTSK 
19655 gctttttgtcagccttag 19638 
29 A F C Q P * 

dplORF272 

1556 gtggtcaagtccgtcaatgaatgtacccgcgattttcctgacgtgataaaagtcaacaaccatcccttgactcgaaccgtggtc 
1 VVKSVNECTCDFliDVIKVNNH PLTRTVV 

1472 ataagttccgcctgctaa 1455 

29 I S S A C * 

dplORF273 

56256 atggatttcattaggactgagtcctctcggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttct 
1 MDFIRTESSWNWNGCIYRYSVSRTRPSS 
56340 agttcagtttatcttgcagtcaattgcttcgagatatttgaaaaagtagtcaggaaaattcctgattatcttgcagtcaattgc 
29 SSVYLAVNCFEIFEKVVRKI PDYLAVNC 

56424 ttcgagatatttgaaaaagtagtcaggaaaattcctgattattttttttacaaaaacgcttga 56486 
57 FEIFEKVVRKIPDYFFYKNA* 
Table 31 



Query= sid| 114822 |lan|dplORF001 Phage dpi ORF| 36698-40390 | 2 
(1230 letters) 

>gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage BK5-T] 
Length = 1904 

Score = 427 bits (1086), Expect = e-118 

Identities = 226/475 (47%), Positives = 281/475 (58%), Gaps = 45/475 (9%) 

Query: 395 AESGKYIGVI^mTKKPSELVPDDFTWIRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIAD 454 

A+ YIG + P D+TW + +G+ G GA G+DGV GK GVGI 
Sbjct: 820 ADYPSYIGQYTDFIQYDSAKPSDYTWSLI RGNDG KDGATGKDG V AGKDGVGIKT 873 

Query: 455 TAI TYAVS VSGTQE PENGWSEQVPEL I KGRFLWTKTFWRYTDGS HETG YSVAYI GQDGNS 514 

T ITYA+S SGT +P GW+ QVP L+KG++LWTKT W YTD S ETGYSV YI +DGN+ 
Sbjct: 874 TVITYALSSSGTDKPNTGWTSQVPTLVXGQYLWTKT^ 933 

Query: 515 G KDG I AG KDG VG I AATEVMYAS S P S ATEAPAG G W S TQVPTVPGGQ YLWTRTR WRYTDQTD 574 

G DGIAGKDGVGI T + YA ST APA GW++QVP VP GQ+LWT+T W YTD T 
Sbjct: 934 G NDG I AGKDGVG I KKTT I TYAVGTS GTT APASG WNS QVPNVPAG Q FLWTKTVWTYTDNTS 993 

Query: 575 E IGYSVS RMGEQGP KGDAGR DGIAGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVP 630 

E GYSV+ MG +G KGD G +GIAGK+G G+K+T+++Y SP + P G W++ VP 
Sbjct: 994 ETGYS VAMMGVKGDKGDPGNNGTNG I AGKDGKG I KATAITYQAS PNGTTAPTGTWS ASVP 1053 

Query: 631 SLI KGQYLWTRT I WTYTDSTTETG YQKTYI PKDGNDGKNG I AGKDGVGI KSTTITYAGST 690 

+ KG +LWTRTIWTYTD+TTETGY Y+ +GN+G +G GKDG GIK+TTITYAGST 
Sbjct: 1054 P VAKG S F LWTRT I WTYTDNTTETG YA VAYMGTNGNNGHDG F PGKDGTG I KTTT I TYAGS T 1113 

Query: 691 SGTVAPTSNWTSAI PNVQPGFFLWTKTVWNYTDDTSETGYSVSKIGETXXXXXXXXXXXX 750 

SGT P + WTS +P V G +LWTKTVW YTD+TSETGYSV+ +G 
Sbjct: 1114 SGTTP PNNGWTSTVPTVAEGNYLWTKTVWTYTDNTSETCYS VAMMG VKGDKGDP 1167 

Query: 751 XXXXXXXXXXADG RS - QYTHLAFSNS PNGEGF SHTDS GRA YVGQ YQD FN P VHS KD P AAYT 809 

DG+ + T + + SPKG A G + P +K +T 

Sbjct: 1168 GNNGTNG I AGKDGKG I KATAI TYQAS PNGT TAPTGTWSAS VP PVAKGSFLWT 1219 

Query: 810 WTKW KGNDG AQG I PG KPGADG KTNYFH I A YAS S ADG S 84 6 

T W GN+G G PGK G KT I YA S G+ 

Sbjct: 1220 RTIWTYTDNTTETGYAVAY>IGTNGNNGHDGFPGKI)GTGIKTT- -TITYAGSTSGT 1272 



Score = 396 bits (1007), Expect = e-109 

Identities = 208/449 (46%), Positives = 260/449 (57%), Gaps = 42/449 (9%) 

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1155 VAMMG VKGDKG - - - DPGNNGTNG I AG KDG KG I KATA I TYQAS PNGTTA PTGTWS AS VPPV 1211 

Query: 481 I KGRF LWTKTFWRYTDGSHETGYS VAYI GQDGNSGKDG I AGKDGVG I AATEVMYAS S PSA 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1212 AKGSFLWTRTIWTYTDrnTETGYAVAYMGTNGNNGHDGFPGKIXJTGIKTTTITYAGSTSG 1271 

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDE I GYSVSRMGEQG PKGDAGR DG I 597 

T P GW++ VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI 
Sbjct: 1272 TTPPNNGWTSTVPTVAEGRYLWTKTVWTYTDNTS ETGYS VAMMG VKGDKGD 1331 

Query: 598 AGKNG I GLKSTSVSYG I SPTDSAIP-GVWASQVPS LI KGQYLWTRT I WTYTDSTTETG YQ 656 

AGK+G G+K+T+++Y SP + P G W++ VP + KG + LWTRT I WTYTD+TTETGY 
Sbjct: 1332 AGKDGKGIKATAITYQASPNGTTAPTCTWSASVPPVAKGSFLWTRTIWTYTDNTTETGYA 1391 

Query: 657 KTYIPKDGNDGKNGIAGKDGVGIKSTTITYAGSTSGTVAPTSNWTSAIPNVQPGFFLWTK 716 

Y+ +GN+G +G GKDG GIK-fTTITYAGSTSGT P + WTS +P V G +LWTK 
Sbjct: 1392 VAYMGTNGNNGHDG F PG KDGTG I KTTT I TYAGS T S GTT P PNNG WT S TV PTVAEG NYLWT K 1451 



Query: 717 TVWNYTDDTSETGYSVSKIGETXXXXXXXXXXXXXXXXXXXXXXADGRS-QYTHLAFSNS 775 

TVW YTD+TS ETGYS V+ +G DG+ + T + + S 
Sbjct: 1452 TVWTYTDNTSETGYSVAMMGVKGDKGDPGNNGTNGIAGKDGKGIKATAITYQAS 1505 

Query: 776 PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAYTWTKW KGND 817 

PNG A G + P +K +T T W GN+ 

Sbjct: 1506 PNGT TAPTGTWS AS V P P VA KG S FLWT RT I WTYTDNTT ETGYA VA YNGTNGNN 1557 

Query: 818 GAQGIPGKPGADGKTNYFHIAYASSADGS 846 

G G PGK G KT I YA S G+ 
Sbjct: 1558 GHDGFPGKDGTGIKTT- -TITYAGSTSGT 1584 



Score = 384 bits (977), Expect = e-105 

Identities = 179/322 (55%), Positives = 222/322 (68%), Gaps = 7/322 (2%) 

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1311 VAMMGVKGDKG - - - DPGNNGTNGIAGKDGKGI KATAITYQAS PNGTTAPTGTWSAS VPPV 1367 

Query: 481 IKGRFLWTKTFVmYTIXSSHETGYSVAYIGQDGNSGKIJGIAGKIXJVGIAATEVMYASSPSA 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1368 AXGS FLWTRTI WTYTDNTTETG YAVAYMGTNGNNGHDGFPGKDGTG I KTTTI TYAGSTSG 1427 

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGR- - -DGI 597 

T P GW++ VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI 
Sbjct: 1428 TTPPNNGVrrSTVPTVAEGNYLWTKTVWTYTDOT 1487 

Query: 598 AGKNGIGLKSTSVSYGISPTDSAIP-GWASQVPSLIKGQYLWTRTIVnTTDSTTETGYQ 656 

AGK+G G+K+T+++Y SP + P G W++ VP + KG + LWTRT I WTYTD +TTETG Y 
Sbjct: 1488 AGKDGKGI KATAITYQAS PNGTTAPTGTWSAS VP PVAKGS FLWTRT I WTYTDNTTETG YA 1547 

Query: 657 KTYI PKDGNDGKNGI AGKDGVG IKSTTITYAGSTSGTVAPTSNWTSAI PNVQPGFFLWTK 716 

Y+ +GN+G +G GKDG G I K+TTITYAGSTSGT P + WTS +P V G +LWTK 
Sbjct: 1548 VAYMGTNGNNGHDGFPGKDGTGIKTTTITYAGSTSGTTPPNNGWTSTVPT^ 1607 

Query: 717 TVWNYTDDTSETGYSVSKIGET 738 

TVW YTD++ ETGYSV K+G T 
Sbjct: 1608 TVWAYTDNS FETGYS VGKMGNT 1629 



Score = 201 bits (507), Expect = 2e-50 
Identities = 121/297 (40%), Positives = 156/297 (51%), Gaps = 19/297 (6%) 

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1467 VAMMGVKGDKG DPGNNGTNGIAGKDGKGI KATAITYQAS PNGTTAPTGTWSAS VPPV 1523 

Query: 481 I KGRFLWTKTFWRYTDGS HETGYS VA YI GQDGNSGKDG I AGKDGVG IAATEVMYAS S PS A 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1524 AKGSFLWTRTI WTYTDNTTETGY AVA YMGTNGNNGHDGF PGKDGTG I KTTTITYAGSTSG 1583 

Query: 541 T EA P AGG WS TQ V PTV PGG Q YLWTRT RWRYTD QTD E I G YS VSRMGEQG P KGDAG RDG I AG K 600 

T P GW++ VPTV G YLWT+T W YTD + E GYSV +MG GP AG +G GK 
Sbjct: 1584 TTPPNNGWTSTVPTVAEGNYLWTKTVWAYTDNSFETGYSVGKMGNTGP AGSNGNPGK 1640 

Query: 601 NGIGLKSTSVS YGI S PTDSAI PGVWASQVPSLI KG -QYLWTRTI WTYTD STTE - - TGYQK 657 

+ T+ G++ S++ ++G+YW W+ G 

Sbjct: 1641 WSDTE PTTKFKGLTWKYSGWDMPLGNGTKI LAGTEYYWNGNNWALYE 1NAHN INGDNL 1700 

Query: 658 TYIPKDGNDGK-NGIAGKDGVGIKSTTITYAGS TSGTVAPTSNWTSAI PNVQ 708 

+ DGK IG+GV +TTGS +S+ TNTAINQ 

Sbjct: 1701 SVTNGTF KDGKI ESI WGSNGV NGTTT I EGSHLQI HS SDSTTNTEN - TLAI DNRQ 1753 



Query= sid| 114823 |lan|dplORF002 Phage dpi ORF| 32386-35835 | 1 
(1149 letters) 

>dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL] 
Length = 694 

Score = 280 bits (709), Expect = 3e-74 

Identities = 157/465 (33%), Positives = 257/465 (54%), Gaps = 28/465 (6%) 

Query: 40 Q IGS ALTGLGKGLTTAVTLPLMGFAAAS I KVGNE FQAQMSRVQAI AGATAEELGRMKTQA 99 

+IG+++ +G+ +T VT P++ A + K G EF M +V+A +GAT EE +K +A 
Sbjct: 151 EIGNSMKNVGRNMTMYVTAPWAGFAVAAKKGI EFDDSMRKVKATSGATGEEFEALKKKA 210 
Query: 100 IDLGAKTAFSAKEAAQGMENLAS AGFQVNE I KDAM PGVLDLXXXXXXXXXXXXXXMAS S L 159

++GA T FSA ++A+ + +A AG+ ++M+ + GV+DL + L 

Sbjct: 211 IUSMGATTKFSASDSAEALNYMA1AGWDSKQ^EG1£ 270 

Query: 160 RAFGLEANQAGHVADVFARAAAimJAETSDMAEAMKWAPVAHSMGI^LEETAASIGIMA 219 

AFGL+A +GH+ADV A+ ++ N + + EA KYVAPVA ++G ++E+T+ +IG+M+ 
Sbjct: 271 TAFGIJCAKDSGHIJU3VIAQTSSKANTDVRGLG 330 

Query: 220 DAGI KG SQAGTTIiRGALSRI AKPTKAMVKSMQELG VSFYDANGNMI PLREQ I AQLKTATA 279 

+AGIKG +AGT LR + ++ PT+AM M+ LG+S D+NG MIP+R+ + QIj+ 
Sbjct: 331 NAG I KG E KAGTALRTMFTNLS S PTRAMGN E ME RLG ISITDSNGKMI PMRKLLD QLREKFK 3 90 

Query: 280 GLTQEERNRHLVTLYGQNSLSGMLALLDAG PEKI^KMTNALVNSDGAAKEMAETMQDNLA 339 

L+++++ T++G+ ++SG LA+++A E K+T ++ +S GA+K MA+TM+ L 

Sbjct: 391 HLSKDQQASSAATIFGKEAMSGA1AIINASDEDYQKLTKSIDSSTGASKRMADTMESGLG 450 l 

Query: 340 SKIEQMGGAFESVAIIVQQII^PAIAKIVGAITKVLEAFVNMSPIGQKMVVIFAGMVAAL 399 

K+ + E +A+ + +EPAL IV A +KV+ + Q W F VA L 

Sbjct: 451 GKLRTLRSQLEELALTIYDRIEPALKIIVSAFSKVVTWVTK^ 510 

Query: 400 GPLLLIAGM - -VMTTIVKLRIAIQFLGPAFMGTMGTIAGVIAIF-- 441 

GPL+ + G+ MT + L I t F IA ++ +F 

Sbjct: 511 GPLVFMFGLFISVMGNAMTVLGPLLINVNKASGLFAFIjRTKIASLVKLFPII/SVSISSLT 570 

Query: 442 YALVAV FMIAYTKSERFRNFINSLAPAIKAGFGGA 476 

ALV + F AY +SE FRN +N + F A 

Sbjct: 571 LPITLIVGALVGIGIAFYQAYKRSETFRNXVNQAISGVANAFKAA 615 

Query= sid|114824|lan|dp1ORF003 Phage dp1 ORF|53538-55877|3
(779 letters)

>sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|1074025|pir||E64098 DNA polymerase I
(polA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi|1573871 (U32767) DNA polymerase I (polA)
[Haemophilus influenzae Rd]
Length = 930

Score = 191 bits (481), Expect = 1e-47

Identities = 148/553 (26%), Positives = 262/553 (46%), Gaps = 60/553 (10%)

Query: 63 RLELITEEAKLEQYVDKMIEDGIGSIDVETTOGIjDTIHDEIiAGVCLYSPSQKGIYAPVMW 122 

+ E + +A L ++++K+ + ++D ETD LD + L G+ + + Y P+ 

Sbjct: 333 IC^ETLLTQADLTRWIEKI^AAKLIAVDTETDSIJ)Y>lSANLVGISFAI i ENGE^YLPU3I^ 392 

Query: 123 SNMTKMRIKNQISPEFMKKMEiQRIVDSGIPVIYHNSKFDMKSIYWRL^ 182 

++ + +K +L+ + I I N KFD +SI+ R G+++ +DT L 
Sbjct: 393 YLDAPKTLEKSTAIiAAIKPILE NPNIHKIGQNIKFD-ESIFARHGIELQGVEFDTML 44 8 

Query: 183 AAMLLNENESHSLKSIJISKYVRNEENAEVAKFNDLFKGIPFSLIPPDV^ 242 

+ LN H++ L +Y+ +E A + + F+ IP + A YAA D T 

Sbjct: 449 L S YTLNS TG RHNMDDLAKR YLGHET I AFES LAGKG KS QLT FNQ I PLE QATE Y AAED AD VT 508 

Query: 243 FELYEFQEQYLTP<3TEOCEEYNLEKVSWVLHNIEMPLIKVLFDMEVYGVDIiDQ 302 

+ L + l E Y +E+PL+ VL ME GV +D D L 

Sbjct: 509 MKLQQALWLKLQEE PTLVELYK TMELPLLHVLSRMERTGVLIDSDALFMQS 559 

Query: 303 EQFTANMNEAEQEFQQLVS EWQPEI EELRQTNFQSYQKLEMDARGRVTVS ISS PTQLAI L 362 

+ + + E++ L + +++S QL + 

Sbjct: 560 NE I AS RLTALEKQAYALAGQ - - - PFNLASTKQLQEI 592 

Query: 363 FYDIMGLKSPERDKPRG - - - TGES I VEH - - FDNDI SXXXXXXXXXXXXVSTYTT- LDQHL 416 

+D + L ++ P+G T E ++E + +++ STYT L Q + 

Sbjct: 593 LFDKLELPVIiQKT-PKGAPSTNEEVLEEI^YSHELPKILvTCHRGLSKLKSTYTDKLP 651 

Query: 417 AKPDNRI HTTFKQYGAKTGRMSS ENPNLQN I PSRGE - GAWRQ I FAAS EGHY 1 1 G S DYS Q 475 

R+HT++ Q TGR+SS +PNLQNIP REG +RQ F A EG+ 1+ +DYSQ 
Sbjct: 652 NSQTGRVHTS YHQAVTATGRLSSSDPNLQNI PIRNEEGRHIRQAFI AREGYS I VAADYSQ 711 

Query: 476 QEPRSIAEI^GDESMRHAYEQNLDLYSVIGSKLYGVTYEECLEFYPDGTTO 535 - 

E R +A LSGD+ + +A+ Q D++ ++++GV +E T+++ R + 
Sbjct: 712 IELRIMAHLSGDQGLINAFSQGKDIHRSTAAEIFGVSLDE VTSEQ RRN 759 



Query: 536 VKS VLLGLMYGRGANS I AEQKNVS VKEANKV I EDFFTEFPKVADYI I FVQQQAQDLGYVQ 595 
K++ GL+YG A ++ Q+ +S +A K ++ +F +P V ++ ++++A+ GYV+ 
Sbjct: 760 AKAINFGLIYGMSAFGI^RQLGISRADAQKYMDLYFQRYPSVQQFMTDIREKAKAQGYVE 819

Query: 596 TATGRRRRLPDMS 608 

T GRR LPD++ 
Sbjct: 820 TLFGRRLYLPDIN 832 

Score = 46.9 bits (109), Expect = 5e-04

Identities = 34/123 (27%), Positives = 66/123 (53%), Gaps = 16/123 (13%)

Query: 663 EIKDQAKAEGI LIKDNGGKIADAQRQCLNSVTQGTAADMTKYAMIKV 709 

+I+++AKA+G + N + A+R +N+ +QGTAAD+ K AMIK+ 

Sbjct: 807 DIREKAKAQGYVETLFGRRLYLPDINSSNAMRRKGAERVAINAPMQGTAADIIKRAMIKL 866 

Query: 710 HNDAELKELGFHI^IPVHDELLGEVPIKNAKRGA£RLTETWIEAAKDIISL 769 

++ + +++ VHDEIrf EV + E++ + M EAA +++ +P+ + + 

Sbjct: 867 -DEVIRHDPDIEMIMQVHDELVFEVRSEKVAFFREQIKQHM-EAAAELV-VPLIVEVGVG 923 

Query: 770 ERW 772 
+ W 

Sbjct: 924 QNW 926 

Query= sid|114825|lan|dp1ORF004 Phage dp1 ORF|40401-42440|3
(679 letters)

>emb|CAB07981| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 532

Score = 1011 bits (2585), Expect = 0.0

Identities = 497/499 (99%), Positives = 498/499 (99%)

Query: 1 MTKFINSYGPLHLNLYVEQVSQDVTNNS S RVSWRATVDRDGAYRTWTYGNISNLS VWLNG 60 

MTKFINSYGPLHUJLYVEQVSQDVTNNS S RVSWRATVDRDGAYRTWTYGNISNLSVWLNG 
Sbjct: 1 MTKF INS YG P LHLNLYVEQVS QDVTNNS S R VS WRATVD RDGAYRTWTYGN I SNLSVWLNG 60 

Query: 61 SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSWASFT)PNNGV^ 120 

SSV^SSHPDVDTSGEEWLASGEVTVPHNSDGTKTMSVWASFDPNNGVHGNITISTNYTL 
Sbjct: 61 SSVHSSH PD YDTSGEEVTLASGEVTVPHNSDGTKTMS VWAS FDPN^ 120 

Query: 121 DSIPRSTQISSFEGNRNIASSIiHTVIFNRKVNSFTHQVWYRVFGSDW 180 

DSIPRSTQISS FEGNRN LGS LHTVT FNRKVNS FTHQVWYRVFGSDWIDLGKNHTTSVS FT 
Sbjct: 121 DS I PRSTQI SSFEGNRNLGSLHTVI FNRKVNS FTHQVWYRVFGSDWIDLGKNHTTSVS FT 180 

Query: 181 PSLDLARYLPKSSSGTMDICXRTYNGTTQIGSDVYSNGWRE^IPDSVRf TFSGISLV13TT 240 

PSLDLARYLPKSSSGTNDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 
Sbjct: 181 PSLDIJ^YLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 240 

Query: 241 SAVRQ I LTGNN FLQI MSNIQVNFNNASGAYG ST I QAFHAELVGKNQAINENGGKLGMMNF 300 

S AVRQ I LTGNN FLQI MSNIQVNFNNASGAYGST I QAFHAELVGKNQAINENGG KLGMMNF 
Sbjct: 241 SAVRQILTGNNFLQIMSNIQWFNNASGAYGSTIQAFHAELVGKNQA1NENGGK1X5MMNF 300 

Query: 301 NGSATVRAWVTOTRGKQSIIVQDVSINVIEYYGPSINFSVQRTRQNPAIIQALRNAKVAPI 360 

NGSATVRAWVTDTRGKQSNVQDVSINVIEYYGPSINFSVQRTRQNPAIIQAUUIAKVAPI 
Sbjct: 301 NGSATVlUkWvTDTRGKQSNVQDVSINVIEYYGPSINFSVQRTRQNPAIIQAljRNAKVAPI 360 

Query: 361 TVGGQ^KNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLMTNSSANLA^ 420 

TVtJGC^KNIKQITFSVAPLNTTNFTEDRGSASGTFTTISL+TNSSANLAGNYGPDKSYIV 
Sbjct: 361 TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLLTNSSANLAGNYGPDKSY1V 420 

Query: 421 KAKIQDRFTSTEFSATVATESVVLNYDKDGRLGVGKVVEOCKAGSIDAAGDIYAGGRQVQ 480 

KAKIQDRFTSTEFSATV TES VVLNYDKDGRLGVGKVVEQGKAGS I DAAGDI YAGGRQ VQ 
Sbjct: 421 KAKIQDRFTSTEFSAWPTESVVLNYDKDGRI/SVGKVVEQGKAGSIDAAGDIYAGGRQVQ 4 80 

Query: 481 Q FQLTDNNGALNRGQ YND V 499 

QFQLTDNNGALNRGQYNDV 
Sbjct: 481 QFQLTDNNGALNRGQYNDV 499 
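
The statistics line of each alignment gives match counts over the aligned length, with a percentage in parentheses. As a minimal sketch (the function names `parse_stats` and `truncated_percent` are illustrative assumptions, not part of the BLAST program), such a line can be parsed and its percentages recomputed. The sketch assumes the percentage is the integer part of the ratio, which agrees with the 497/499 (99%) entry above but is not confirmed by the listing for every record.

```python
import re

# Matches e.g. "Identities = 497/499 (99%)"; the field names are taken from the listing.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")

def parse_stats(line):
    """Return {field: (matches, aligned_length, listed_percent)} for one statistics line."""
    return {name: (int(n), int(d), int(p)) for name, n, d, p in STAT.findall(line)}

def truncated_percent(matches, aligned_length):
    """Integer percentage with the fractional part dropped (an assumption about the listing)."""
    return matches * 100 // aligned_length

line = "Identities = 497/499 (99%), Positives = 498/499 (99%)"
stats = parse_stats(line)
for name, (n, d, listed) in stats.items():
    # Print the listed percentage next to the recomputed one for comparison.
    print(name, f"{n}/{d}", listed, truncated_percent(n, d))
```

A mismatch between the listed and recomputed value would flag a statistics line worth checking against the original record.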



Query= sid|114827|lan|dp1ORF006 Phage dp1 ORF|45296-46987|2
(563 letters)

>gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Chlamydia pneumoniae]
Length = 1166

Score = 171 bits (429), Expect = 1e-41

Identities = 150/522 (28%), Positives = 254/522 (47%), Gaps = 55/522 (10%)

Query: 46 SSNNFE-LPYKYFNNVIDALDEWELHIFGELDKDVQDYIDSRNRIASSSNEQFSFKTTPF 104

S + FE LP + +++LE+ I GE++ D QD + T
Sbjct: 659 SLDQPEALPVNF--SMSERLIEIQKQIRGEIEFDFQD VPQQI QATLRS YQT EG 709

Query: 105 AHQVECFEYAQEHPCFLLGDEQGLGKTKQAIDIAVSRKASFKH- -CLIVCCISGLKWNWA 162

H +E + H +L D+ GLGKT QAI IAV++ K C ++ C + L +NW
Sbjct: 710 VHWLE - - RLRKMHLNGI LADDMGLGKTLQAI - IAVTQSKLEKGSGCSLIVCPTSLVYNWK 766

Query: 163 KEVGIHSNESAHILGSRVTKDGKLVIDGV-SKRAEDLLGGHDEFFLITNIETLRDAVFIK 221

+E + E LVIDGV S+R + L D IT+ L+ V
Sbjct: 767 EEFRKFNPEFR TLVIDGVPSQRRKQLTALADRDVAITSYNLLQKDV- - - 812

Query: 222 YLNELTKSGEIGMVI IDE IHKCKNP SS KQGAS I QKLQS YYKKGLTGTPLMNHPI 0VFNVM 281

EI* KS V++DE H KN +++ S++ +QS +++ LTGTP+ N+ +++++
Sbjct: 813 - - -ELYKSFRFDYVVLOEAHHIKNRTTRNAKSVKMIQSDHRLILTGTPIENSLEELWSLF 869

Query: 282 KWLGAEHHTLTQFKERYCIVDQFNQITGYR NLAELRELVNDYMLRRTKEEVL-DL 335

+L L +R+ V ++ + Y N+ L++ V+ ++LRR KE+VL DL
Sbjct: 870 DFI^PG---LLSSYDRF--VGKYIRTGNYMGNKADNWALKi^^ 924

Query: 336 PEKIRVTEYVDMNSKQSKIY KEVLTKLVQEI DKVKLMPNPLAETI RLRQATGN 388

P + + + Q ++Y K+ L++LV++ ++ + LA RL+Q +
Sbjct: 925 PPVS E I LYHCHLTESQKEIiYQS YAAS AKQELSRLVKQEG FERIHI HVIiATLTRUCQ I CCH 984

Query: 389 PSILTTQDVK- - -SCKFERCIEIVEECIQQGKSCVIFSNWEKVIEPLAKIL-SICrVKCNL 444

P+I + S K++ ++++ + G V+FS + K++ + K L S+ +
Sbjct: 985 P AI FAKDAPEPGDSAKYDMLMDLLS S LVDSGHKTWFSQYT KMLGI I KKDLESRGI PFVY 1044

Query: 445 VTGETADKFNE I E EFMNHRKAS VI LGTIGALGTG FTLTKADTVI FIDS PHTRAEKDQAED 504

+ G T ++ + + +F V L ++ A GTG L ADTVI D W A ++QA D
Sbjct: 1045

Query: 505 RCKRIGAKSSVTIYTLVAXGTVDERIEOLIERKGELADYIVD 546

R HRIG SV+ Y LV T++E+I h RK L +++
Sbjct: 1105


Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3
(463 letters)

>gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate bacteriophage
O1205]

Length = 411
Score = 88.9 bits (217), Expect = 7e-17

Identities = 80/315 (25%), Positives = 133/315 (41%), Gaps = 48/315 (15%)

Query: 139 QG VTLAG I FCDE VALM PES FVNQ ATGRCS VTG S KMW F S CN PAN PNHY FKKNW I D KQVE KR 198 

+G T G + +E +L E + RCS G+++ + NP NPNH+ +++I K + + 
Sbjct: 121 RGFTAFGAYVNEASIJVNELVFKEIISRCSGIX3ARVVTO 179 

Query: 199 ILYLHFTMDDNPSLT DS I KRRYEKMYAGVFRKRF I LGLWVTADGLVYSMFNEEQHV 254 

1+ F +DDN L+ DSIK K G F R ILGLW A+G +Y+ ++ + HV 
Sbjct: 180 IIDFSFKLDDNTFLSKRYIDSIKAATPK GKFYDRDILGL.WTVAEGAIYADYDSKIHV 236 

Query: 255 KKLNIEFDRLFVAGDFGIYNATTFGLYGFSKRHKRYHLIESYYHSGREAEEQLTEADVNS 314 

E R F D+G + + + G ++L++ +E + + +A 
Sbjct: 237 VDELPEMKRYFGGIDWGYTHYGS I VI VG - EGVDNNFYLVDG VAAQFKE IDWWVEQA 291 

Query: 315 NIQFSSVI^KTTKEYANDLVDMIRGKQIEYIILDPSASAMIVELQKHPYIAR KNIPI 371 

+K T Y N + + ++AR + I 

Sbjct: 292 RKLTGIYGN I PFYADSARPEHVARFENEGFDI 323 

Query: 372 IPARNDVTIX3ISFHAELIAENRFTLDPSNT-HDIDEYYAYSWDSKASQTGEDRVIFCEHDH 430 

+ A V GI A+L E + + DE Y Y W ++ +D +KE D 

Sbjct: 324 MNANKSVIAGIELIAKLFKEKKLYVKRGFVPRFFDEIYQYRWKENST- - -KDEPLKEFDD 380 

Query: 431 CMDRNRYACLTDALI 445 
+D RYA +D +1 

Sbjct: 381 VLDSVRYAIYSDYVI 395 ' 



Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

>gb|AAD19901| (AF100420) DnaB replication fork helicase [Thermus aquaticus]
Length = 444
Score = 67.5 bits (162), Expect = 2e-10

Identities = 69/248 (27%), Positives = 111/248 (43%), Gaps = 14/248 (5%)

Query: 147 GERLGISTGFEXXXXXXXXXXXXXXXIVIMARPGQGKS-WTIDKMLATAWKNGHDVLLYS 205 

GE G+ TGF+ I I ARP GK+ + + A K G V +YS 

Sbjct: 178 G EVAGVRTG FKE LDQLI GTLG PGSLN I - 1 AAR PAMG 1CTAFALT I AQNAALKEGVGVG I YS 236 

Query: 206 GEMSEMQVGARIDTILSNVSINSITKGIWNDHQFEKYEDHIQAMTEAENSLVVVTPFMIG 265 

EM Q+ R+ + + +N + G D F + D ++EA + TP + 
Sbjct: 237 LEMPAAQLTLRMMCSEARIDMNRVRLGQLTDRDFSRLVDVASRLSEAP-IYIDDTPDLTL 295 

Query: 266 GKNLTPAILDSMISKYRPSWGIDQLSLMS--ESYPSREQKRIQYANITMDLYKISAKYG 323 

+ A ++S+ + ++ ID L LMS S S E ++ + A 1+ L ++ + G 
Sbjct: 296 ME - - VRARARRLVSQNQVGLI I IDYXiQLMSG PGSGKSGENRQQEI AAISRGLKALARELG 353 

Query: 324 IPIVLNVQAGRSAKTEGAESMELEHIAESDGVGQNASRVIAMKRD EKSGILEL 376 

IPI+ Q R+ + + L + ES + Q+A V+ + RD EK+GI E+ 

Sbjct: 354 IPIIALSQLSRAVEARPNKRPMLSDLRESGSIEQDADLVMFIYRDEYYNPHSEKAGIAEI 413 

Query: 377 SWKNRYG 384 

V K R G 
Sbjct: 414 IVGKQRNG 421 



Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(386 letters)

>gi|2760912 (AF037258) RecA protein [Chlorobium tepidum]
Length = 346

Score = 133 bits (331), Expect = 2e-30

Identities = 99/340 (29%), Positives = 164/340 (48%), Gaps = 66/340 (19%)

Query: 44 GGLPRKRVV^FFGPESSGKTTSALDIVKNAQMVFXXXXXXXXXXXXXXXXNARASKASKT 103 

GGLPR RV E +GPESSGKTT AL + AQ 
Sbjct: 67 GGLPRGRVTE I YGPESSGKTTLAIiHAXAEAQ KNG 100 

Query: 104 AVTCELEMQIjDSIiQEPLKIVYIJ3LENTLDT 163 

+ L +D E+ D +A+K+GVD++ + + +PE S E+ L V 

Sbjct: 101 GIAAL VDAEHAFDPTYARKLGVD I NALLVSQ P E - - SGEQALS I VE 143 

Query: 164 DIFETGEVGLVVLDSLPYMVSQNLIDEELTKKAYAGISAPLTEFSRKVTPI.LTRYNAIFL 223 

+ +G V ++V+DS+ +V Q ++ E+ + +++ RK+T +++ +++ L 

Sbjct: 144 TLVRSGAVDIIVIDSVAALVPQAELEGEMGDSVVGLQARLMSQALRKLTGAISKSSSVCL 203 

Query: 224 GINQIREDMNSQYNA- YSTPGGKMWKHACAVRLKFRKGDYLDENGASLTRTARNPAGNVV 282 

IKQ+R+ + Y + +T GGK K +VRL RK + ++G L GN 
Sbjct: 204 FINQIiRDKIGVMYGSPETTTGGKALKFYSSVRLDIRKIAQI-KDGEELV GNRT 255 

Query: 283 ESFVEKTKAFKPDRKLVSYTLSYHDGIQIENDLVDVAVEFGVIQKAGAWFSIVDLETGEI 342 

+ VKK PK + + Y +GI + +L+D+AVEFG+I+K+GAWFS + G 
Sbjct: 256 KVXVVKNKV-APPFKTAEFDILYGEGISVliGELIDIAVEFGIIKKSGAWFSYGTEKI/3-- 312 

Query: 343 MTDEDEEPLKFQGKANLVRRFKEDDYLFDMVMTAVHEIIT 382 

QG+ N+ + KED+ L + + V +++T 
Sbjct: 313 QGRENVKKLLKEDETLRNTI RQQVRDMLT 341 



Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

>gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate bacteriophage
O1205]

Length = 348
Score = 187 bits (469), Expect = 1e-46

Identities = 118/358 (32%), Positives = 187/358 (51%), Gaps = 21/358 (5%)

Query: 3 I YD YINAGE I AS YI QALPSNALQYLG PTL FPNAQQTGTD I SWLKGANNL PVT I Q PSNYDA 62 

I YD + A IA Y AL N LG ++FP +Q GT +S++KGA+ V ++ + +D 
Sbjct: 4 IYDKVTASNIAGYFNALQENVSSTLGESIFPARKQIiGTKLSYIKGASGQSVALKAAAFDT 63 

Query: 63 KASLRERAGFSKQATEMAFFRESMRLGEKDRQNLQMLLNQSSA-LAQPLITQLYNDTKNL 121 
++R+R +M FF+E+M + E DRQ L ++ + +A L ++ ++ND L

Sbjct: 64 NVTIRDRVSAEMHDEQMPFFKEAMIjVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTL 123 

Query: 122 VDGVEAQAE YMRMOLLQYGKFTVXSTNSEAQYTYDYN>TO 181 

V+G A+ E MRMQ+L GK S YD K+Q V+K W P + P+A 

Sbjct: 124 WGARARLEAKRMQVLATGKIAFTSDGVNKDIDYGVKPDHKKQ- -VSKSWAEPG- ATPLA 180 

Query: 182 DILAAMDDIENRTGWPTRMVLNRNTYNQMTKSDSIKKAL-AIGVG^SWENF 240 

D+ A+ + G+ P R V+N T+ + K+ S K + + GS + ++ E 

Sbjct: 181 DLEDAI-ETAREliGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGDGS AVTKAELE 235 

Query: 241 KFIAEKTGLQIAVYSKKIAQFADADKLPDVGNIRQFNLIDDGFCVVLLPPDAVGHTWYGTT 300 

+IA+ G+I + + DG + +F DG + L+P +G+T +GTT 

Sbjct: 236 NY I ADNFGVS I VLENGTYRN DKGEVSKF - - YPDGHLTLI PNG P LGNTVFGTT 285 

Query: 301 PEAFDLASGGT - DAQVQVLSGGPTVTTYLEKHPVNI ATVVSAVMI PSFEGIDYVGVLT 357 

PE DL + T +A+V+++ G VTT PVN+ T VS V +PSFE +D V +LT 

Sbjct: 286 PEES DL F ADNTVNAEVE IVDNGI AVTTT KTTD P VNVQTKVS MVA LPS FE R LDD VYMLT 343 



Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters)

>sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA AND TAU
Length = 563

Score = 182 bits (458), Expect = 2e-45

Identities = 118/353 (33%), Positives = 176/353 (49%), Gaps = 31/353 (8%)

Query: 7 Y R PQT F EEWAQ E YVKE I LLNQ LQNG AI KHGYL FCXXXXXXXXXXXRI F AKD VN - 60 

+RPQ FB+W QE++ + L N L H YLF +IFAK VN 

Sbjct: 10 FRPQRFEDVVGQraiTKTLQNALLQKKFSHAYLFSGPRGTGKTSAAKIFAKAVNCEHAPV 69 

Query: 61 KGL GSPIEIDAASNNGVENVRNIIEDSRYKSMDSEFKVYTIDEVH 105

KG+ I E IDAASNNGV+ +R+I + ++ +KVYIIDEVH 

Sbjct: 70 DEPCNECAACKGITNGSISDVIEIDAASNNGVDEIRDIRDKVKFAPSAVTYKVYIIDEVH 129 

Query: 106 MLSTGAFNALLKTLEEPSSGTVTILCTTDPQKIPDTILSRVQRTO 165 

MLS GAFNALLKTLEEP +FIL TT+P KIP TI+SR QRFDF RI + IV ++ 
Sbjct: 130 MLSIGAFNALLKTLEEPPBHCIFILATTEPHKIPLTIISRCQRFDFKRITSQAIVGRMNK 189 

Query: 166 IIESENEEGAGYSYERDAI^FIGKIJ^GGMRDSITRI*EKVIJ)YSHHVDMEAVSNAL- --G 222 

I+++E E +L I A+GGMRD+++ L++ + +S D+ V +AL G 

Sbjct: 190 IVDAEQ LQVEEGS LE 1 1 AS AAHGGMRDALS LLDQAI S FSG - - D I LKVEDALLI TG 242 

Query: 223 V PD YET FAS LVEA I ANYDG S KCLE I VND FHYSG KD LKLVTRN FTD F LLEVCKYWL VRD I S 282 

L +++ + + S LE +N+ GKD + + + ++ Y + 
Sbjct: 243 AVSQLYIGKLAKSLHDKNVSDALETI^ELIXK^KDPAKLIEDMIFYFRDMLLYKTAPGLE 302 

Query: 283 ITQLPAHFESKLEQFCEAFQYPTLLWMLEEMNELAGVVKWEPNAKPIIETKLL 335 

+ + E L M++ +N+ +KW + + E ++ 

Sbjct: 303 GVLEKVKVDETFRELSEQI PAQALYEMIDI LNKSHQEMKWTNHPRI FFEVAW 35S 



Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

>sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64227 DNA primase (dnaE) homolog
MG250 - Mycoplasma genitalium (SGC3) >gi|3844848
(U39704) DNA primase (dnaE) [Mycoplasma genitalium]
Length = 607

Score = 57.0 bits (135), Expect = 2e-07

Identities = 53/190 (27%), Positives = 89/190 (45%), Gaps = 17/190 (8%)

Query: 146 EELDKYRFIHP YMYERKLTDELIEMFDVGYDK- -LHDCITFPVRNLKGETVFF 196 

E +++Y FI+P Y++ K ++FD K +IP++GVF 

Sbjct: 170 ESMERYPFINPKIKPSELYLFS - KTNQQGLGFFDFNTKKATFQNQIMI PIHDFNGNPVGF 228 

Query: 197 NRRSVRSKFHQYGEDDPKTEFLYGQYELVAFRDYFEKPISQVFVTESVINCLTLWSMKIP 256 - 

+ RSV + ++ EF + + EL+ K ++Q+F+ E + TL + K _ 

Sbjct: 229 SARSVDNINKLKYKNSADHEF-FKKGELLFNFHRLNKNLNQLFIVEGYFDVFTLTNSKFE 287 " 

Query: 257 AVALMGVGGGN- QINLLKR - - LP Y1WIVLALDPDNAGQTAQEKLYRQLKRSK- VVRFLNY 312 

AVALMG+ + QI +K + +VLALD D +GQ A L +L + +V + + 

Sbjct: 288 A VALMG LALNDVQ I KA I KAH F KE LQT LVLALDNDASGQNA VF S L I E KLNNNNF I VE I VQW 347 
Query: 313 PKEFYDNKWD 322

+ D WD 
Sbjct: 348 EHNYKD--HD 355 



Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3
(296 letters)

>emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine amidase [bacteriophage Dp-1]
Length = 296

Score = 661 bits (1686), Expect = 0.0

Identities = 296/296 (100%), Positives = 296/296 (100%)

Query: 1 MGVDIEKGVAWMQARKGRVSYSMDFRDGPDSY^ 60

KGVDI EKG VAWKQARK6RVSYSMDF11DGPDS YDCS S SMVYALRS AGASSAGHAVin'E YMH 
Sbjct: 1 MGVDIEKGVAWMQARKGRVSYShTOFRTOPDSYDCSSSMYYALRSAGASSAGWAVNTEYMH 60 

Query: 61 AWLIENGYELISENAPWDAKRGDIPIWGRKGASAGAGGHTGMPIDSDNIIHC3TYAYDGIS 120 

AW LI ENGYEL I S ENA P WD AKRGD I F I WG RKG AS AGAGGHTGM PIDSDNIIH CNY A YDG I S 
Sbjct: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMPIDSDNIIHCNYAYDGIS 120 

Query: 121 VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYIE 180 

VNDHDERWYYAGQ PYYYVYRLTNANAQPAEKKLGWQKD ATG FWYARANGTYPKD E PE YI E 
Sbjct: 121 VNDHDERWYYAGQ PYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYIE 180 

Query: 181 ENKSWFYFDDQGYMLAEKWLKHTDGNWYWFD 240 

ENKSWFYTDDQGYMIJ^K>^LKHTDGNWYWFDRDGYMATSWKR 
Sbjct: 181 ENKSWFYFDDC^YMLAEKWUCHTDGNWYWFDRDGYMATSWKRIGESWYYTO 240 

Query: 241 IKYYDNWYYCDATNGDMKSNAFIRYNEM3WYLLLPDGRLADKPQFTVEP 296 

I KYYDNWYYCDATNGDMKSNAPIRYNDGWYIJXPDGRIJU)KPQPTVEPDGLITAKV 
Sbjct: 241 IKYYDNWYYCDATNGDMKSNAFIRYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV 296 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters)

>emb|CAB13247| (Z99111) similar to coenzyme PQQ synthesis [Bacillus subtilis]
Length = 243

Score = 217 bits (548), Expect = 5e-56

Identities = 117/248 (47%), Positives = 163/248 (65%), Gaps = 15/248 (6%)

Query: 23 MPIMEIFGPTIQGEGMVIGQKTIFIRTGGCDYHCNWCDSAFTWNGTTEPE- -YITGKEAA 80 

+P++EIFGPTIQGEGMVIGQKT+F+RT GCDY C +WCDS AFTW+G + + + ++T +E 
Sbjct: 5 IPVLEIFGPTIO^EGMVIGQKTMFVRTAGCBYSCSWCDSAFTWDGSAKKDIRWMTAEEIF 64 

Query: 81 S RI LKLAFNDKGEQ I CNHVTLTGGN PALINE PMAKM I S I LKEHGFKFGLETQGTRFQEWF 140 

+ + D G +HVT++GGNPAL+ + + I +LKE+ + LETQGT +Q+WF 

Sbjct: 65 AEL KD I GGDAFSHVTI SGGN PALLKQ - LDAP I ELLKENNIRAALETQGTVYQD WF 118 

Query: 141 KEVSDITISPKPPSSGKRTNMKILEAIVDRM--NDENLDWSFKIVIFDENDLAYARDMFK 198 

+ D+TISPKPPSS MTN + L+I+ + ND S K+VIF++ DL +A+ + K 

Sbjct: 119 TLIDDLTISPKPPSSKMVTNFQKLDHILTSI^Em)RQHAVSLKWIFNDEDLEFAKTVHK 178 

Query: 199 TFEGKLRPVNYLSVGNANAY- - EEGKISDRLLEKLGWLWDKVYEDPAFNNVRPLPQLHTL 256 

+ G YL VGN + + ++ + LL K L DKV D N VR LPQLHTL 

Sbjct: 179 RYPG- - - IPFYLQVGNDDVHTTDDQSLIAHLLGKYEALVDKVAVDAELNLVRVLPQLKTL 235



Query: 257 VYDNKRGV 264 

++ NKRGV 
Sbjct: 236 LWGNKRGV 243 




Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

>sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >gi|98411|pir||A38256 GTP
cyclohydrolase I (EC 3.5.4.16) - Bacillus subtilis
>gi|143231 (M37320) regulatory protein [Bacillus
subtilis] >gi|143799 (M80245) HtrA [Bacillus subtilis]
>gi|2634696|emb|CAB14194| (Z99115) GTP cyclohydrolase I
[Bacillus subtilis]
Length = 190

Score = 208 bits (523), Expect = 4e-53

Identities = 103/185 (55%), Positives = 133/185 (71%), Gaps = 1/185 (0%)

Query: 80 VTLDNTEAAVQRLFGLLGEDAERDGLQDTPFRTVKA 139 

V + E AV+++ +GED R+GL DTP R K AE G EDPK H + F +H 
Sbjct: 4 WKEQIEQAWQILEAIGEDPNREGIJiDTPKRVAKMYAEVFSGLNBDPKEHFQTIFGENH 63 

Query: 140 EDLVLVKDIPFNSUTEHHIAPFVGKVHIAYIPKD-KITGLSKFGRVVEGYAKRLQVQERL 198 

E+LVLVKDI F+S+CEHHL PF GK H+AYIP+ K+TGLSK R VE AKR Q+QER+ 
Sbjct: 64 EE L VLVKD I AFHS MC EHHLVP FYGKAHVAY I P RGGKVTG LS KIARA VEA VAKRPQLQE R I 123 

Query: 199 TQQI ADAI QEVLN PQAVAVI VEAEHTCMSGRG I KKHGATTVTSTMRGLFQDDASARAELL 258 

T IA++I E L+P V V+VEABH CM+ RG++K GA TVTS +RG+F+DDA+ARAE+L 
Sbjct: 124 TSTIAESIVETLDPHGm7VVEAEHM(>nmGVRKPGAKTVTSA 183 

Query: 259 QLIKK 263 
+ IK+ 

Sbjct: 184 EHIKR 188 



Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

>gi|2347102 (U77367) internalin [Listeria monocytogenes]
Length = 821

Score = 55.0 bits (130), Expect = 5e-07

Identities = 44/149 (29%), Positives = 63/149 (41%), Gaps = 13/149 (8%)

Query: 119 FRMNIWPNYVG--DSIVNYVKITLNNCTGKAPGLSIGKEFYAPEFNIICAREATKAGLPV 176 

F + VPN + D + + NNTAPL YPE+K +K + 

Sbjct: 383 FS KTLS VPNN ITS I DGTLI APETI SNNGTYDAPNLKWS LPNYL PE - - VKYTFSQKI P IGT 440 

Query: 177 KSMDYVAQLPAVLR RVTFDLNGGTGTADAVRVEAGKKI S P KPVDPTLTGKAFKGW 231 

+ +Y + L+ +VTF++ G T + V E + P+P PT G F GW 
Sbjct: 441 GTSNYSGFITQPLKELIiDYKVTFNVEGNTSEVETVTEE NLI PEPTS PTKQGYTFDGW 497 

Query: 232 -KVEGESTIWDFDNHMMPDRDVKLVAQFA 259 

E T WDF MP D+ L A F+ 
Sbjct: 498 YD AETGGTKWD FTTG QM P AND LTL YAH F S 526 



Score = 43.4 bits (100), Expect = 0.002

Identities = 47/195 (24%), Positives = 73/195 (37%), Gaps = 12/195 (6%)



Query: 72 YDLTFKDNTET)PEIMALIEGGTVRQQGGTIAGYDT-PMLAQGASNMKPFRWNIYVPNY-- 128

YD +T+ +G +GG +TMA+ F+NYN+
Sbjct: 547 YDALLNEPTTPTKQGYTFDGWYD AETGGNKWD FKTMKMPANDVAFYAHFTINNYQANFD I 606

Query: 129 - - -VGDSIVNYVKITI^CTGKAPGLSIGKEFYAPEFNIKAREATKAGLPVK5MDYVAQL 185

V++Y+ T G+ + A K TK+P + A
Sbjct: 607 DGEVKNETIAYDTLLNEPTTPTKQGYTFDGWYDAETGGTmDFKTKE-MPANDVTLYAHF 665

Query: 186 PAVUlRVTFDr^GGTGTADAVRVEAGKKISPKPVDPTLTGKAFKGW-KVEGESTIWDFDN 244

+ FD++G T + V +A + P+P P+ TG +GW E T WDF
Sbjct: 666 TINNYQANFDIDGAV-TEEWNYDA LIPEPTSPSKTGFTLEGWYDAEVGGTKWDFKT 721

Query: 245 HMMPDRDVKLVAQFA 259

MP D+ L A F+
Sbjct: 722 MKMPAND I TLYAHFS 736





Score = 38.3 bits (87), Expect = 0.057

Identities = 42/169 (24%), Positives = 59/169 (34%), Gaps = 10/169 (5%)

Query: 96 QQGGTIAGYDT-PMLAQGASNMKPPRMNIYVPHYVGDSIVNYVKIT LNNCTGKAPG 150

+ GGT + T MA + F+NYN+D+V + LNT 
Sbjct: 501 ETGGTKTOFTTGQMPANDLTLYAHFSVNSYQANFDIIX^ 560

Query: 151 LS I GKE FYAPEFNI KARBATKAGL P VTCSMDYVAQLPAVLRR VTFD LNGGTGTADAVR VEA 210 

+Y E + +P + + A + FD++G A 
Sbjct: 561 GYTFDGWTOAETGGNKWDFCTMKMPANDVAFYAHFTI A 616

Query: 211 GKKISPKPVDPTLTGKAPKGW-KVEGESTIWDFDNHMMPDRDVKLVAQF 258 

+ +P PT G F GW E T WDF MP DV LAP 
Sbjct: 617 YDTLXirc PTTPTKGXJYTFDGWYDAETGGTKWDFKTKEMPANDVTLYAHF 665 



Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1346|2
(228 letters)

>gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB)
[Archaeoglobus fulgidus]
Length = 239

Score = 119 bits (295), Expect = 2e-2fi

Identities = 79/224 (35%), Positives = 113/224 (50%), Gaps = 11/224 (4%)

Query: 1 MKSVVLLSGGVDSATCLAIEVDKWGSKNVHA 60 

MK+V+LLSGG+DS+T L +D G VHA+ F YGQKH E+E+A VA V+ 
Sbjct: 1 MKAVMLI^GGIDSSTLLYYLIJJ- -GGYEVHALTFFYGQKHSKEIESAEKVAKAAKVRHLK 58 

Query: 61 LEIDSKIYXXXXXXIJ^KGEISHGKSYAEIIJ^ 120 

++I S 1+ L G+ E+ Y+E + + T VP RN ++LS 
Sbjct: 59 VDI-STIHDLISYGALTGEBEVPKA-FYSEEVQRR TI VPNRNMILLS - - IAAGYAV 110 

Query: 121 XXXXXXXXXXXXXXXXXXXPIXrTPEFYNSMSNAMEYGT-GGKVTLVAPLLTLTKAQVVKW 179 

PDC EF ++ A+ V + AP + +TKA +V+ 

Sbjct: 111 KIGAKEVHYAAHLSDYSIYPDCRKEFVKALDTAVYLANIWTPV^ 170 

Query: 180 GIDLDVPYFLTRSCYESDAESCGTCATCIDRKKAFEENGMTDPI 223 

G+ L VPY LT SCYE C +C TC++R +AF NG+ DP+ 

Sbjct: 171 GLKLGVPYELTWSCYEGGDRPCLSCGTCLERTEAFLANGVKDPL 214 



Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters)

>emb|CAB13248| (Z99111) similar to hypothetical proteins [Bacillus subtilis]
Length = 165

Score = 220 bits (556), Expect = 4e-57

Identities = 103/139 (74%), Positives = 117/139 (84%)

Query: 5 TTRTDAELTGVTLLGNQDTKYDYDYNPDVLETFPN 64 

TTR ++BL GVTLLGNQ T Y ++Y PD VLB + FPNKH +Y V F+ EFTSLCPKTGQ 
Sbjct: 2 TTRKESELEGVTLU3NQGTNYLFEYAPDVLESFPNKHVNRDYFVKFNCPEFTSLCPKTGQ 61 

Query: 65 PDFANVFISYIPNEKMVESKSLKLYLFSFRNHGDFHEDCMNIIIiNDLYELMEPKYIEVMG 124 

PDFA + + 1 S YI P +E KMVESKS LKL YLFS FRNHGDFHBDCMN I 1 +NDL ELM+P+YIEV G 
Sbjct: 62 PDFATIYISYIPDEKMV^SKSLKLYLFSFRNHGDFHEDCMNIIMNDLIELMDPRYIEVWG 121

Query: 125 LFTPRGGISIYPFVNKVNP 143 

FTPRGGISI P+ N P 
Sbjct: 122 KFTPRGGISIDPYTNYGKP 140 



Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

>gi|1353529 (U38906) ORF12 [Bacteriophage r1t]
Length = 296

Score = 53.5 bits (126), Expect = 1e-06

Identities = 42/149 (28%), Positives = 70/149 (46%), Gaps = 9/149 (6%)

Query: 34 I AS NTVGNG KT S WAVRLLQR YLAE TALDGR I VE KG MF WS AQLLTE FG D YNYFQTMQ E FL 93 

+ S G GK+ A+ +L+ LTL ++ V + F + + F + + F+ 

Sbjct: 155 WS G PAGTG KS HLAMS I L KDCLQHTD LT - - VI FASWS EVLHL I KD S FDNKDS FYSTEYFM 212 
Query: 94 ERFERLKTCELLVIDEIGGGSLTKASYPYLYDLVNYRVDNNI^TIYTTNYTDDEIIDLLG 153

E F + +LLVTD+IG +T+ S L ++++ R TI TTN DEI 

Sbjct: 213 E VF RNTDLLVI DD I G S EKI TEWSMS LLTE VLDART KTI ITTNLKSDE IRKKYH 265 

Query: 154 QRLYSRITOTS WLDFQASNVRGLEVS E I 182 

R YSR++ F N++ VS++ 

Sbjct: 266 NRTYSRLFRGIGKKAFNFENIKDKRVSQL 294 

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

>sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|1074675|pir||F64021 hypothetical
protein HI1190 - Haemophilus influenzae (strain Rd KW20)
>gi|1574117 (U32798) 6-pyruvoyl tetrahydrobiopterin
synthase, putative [Haemophilus influenzae Rd]
Length = 141

Score = 100 bits (247), Expect = 6e-21

Identities = 59/143 (41%), Positives = 83/143 (57%), Gaps = 10/143 (6%)

Query: 2 RVS KTLT FDAAHQLVGH FGKCANLHGHTYKVE I S LAGGTYDHGSSQGMWDFYHVKKI A- 60

++SK +FD AH L GH GKC NLHGHTYK+++ ++G Y G+ + MV+DF +K I
Sbjct: 3 KISKEFSFDMAHLIiDGHDGKCQNLKGHTYKLQVEIS 62

Query: 61 GTFIDRLDHAVLL - QGNEP 1 ALANAVDTKRVLFG FRTTAENMSRFLTWTLTELMWK 115

+D +DHA + Q HE L +++K . FRTTAE ++RF+ L +
Sbjct: 63 KvTLDPMDHAFIYDO/TNERESQIATLLQKLNSKTFGVPFRTTAEEI^^ 120

Query: 116 HARIDSIKLWETPTGCAECTYYE 138

I SI+LWETPT + C Y E
Sbjct: 121



Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

>emb|CAA68244| (X99978) ORF7; hydrophobic protein [Lactobacillus plantarum]
Length = 168

Score = 64.4 bits (154), Expect = 5e-10

Identities = 49/156 (31%), Positives = 84/156 (53%), Gaps = 9/156 (5%)

Query: 8 WLVRTALIAALYVTLTVAFSAISY- -GPIQFRVSEALILLPLWNHRWTPGIVLGTIIANF 65 

W++ AL+AA+YV L + +A S G IQFRVSE L L ++N ++ GIV G 1+ + 
Sbjct: 9 WIIN-ALVAAMYVVI^I^PAAFSLASGAIQFRVSEGIiNHIAVFNRXYIWGIVAGVILFDA 67 

Query: 66 FSP-LGLIDVLFGSLATFLGXXXXXXXXXXXSPLYSLICPVLA NAYLIALELRIVY 120 

F P L++VLFG + L f+ + +A + ++IAL + ++ 

Sbjct: 68 FG PG AS LLNVLFGGGQS LLALXiVLTWLAP KLKTVWQRMLLNI AL FTVSMFM I ALMITMMS 127 

Query: 121 S-LPFWESVIYVGISEAIIVLISYFLISTLAKNNHF 155 

S + FW + + +SE 11+ 1+ ++ +L + HF 
Sbjct: 128 SGVAFWPTYLTTALSELIIMSITAPIMYSLDRVIiHF 163 



Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

>gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis]
>gi|2634394|emb|CAB13893| (Z99114) similar to
deoxyuridine 5'-triphosphate nucleotidohydrolase
[Bacillus subtilis] >gi|3025643 (AF020713) putative
dUTPase [Bacteriophage SPBC2]
Length = 142

Score = 108 bits (267), Expect = 2e-23

Identities = 65/160 (40%), Positives = 83/160 (51%), Gaps = 25/160 (15%)

Query: 5 VDVKMI D PKLDRLKYT - - GDWVD VR I SS ITK I DADS ADVSRCRKVLQKAQVYSVAAGEC I 62 

+ +K +D R+ GDW+D+R + I D + 
Sbjct: 3 IKIKYLDETQTRINKMEQGDWIDLRAAEDVAIKKDEFKL -- 41 

Query: 63 KIAHGFALELPKGYEAILHPRSSLFKKTGLIFVSS-GVIDEGYKGDTDEWFSVWYATRDA 121 

+ G A+ELP+GYEA + PRSS +K G+I +S GVIDE YKGD D WF YA RD 
Sbjct: 42 -VP LGVAMELP EG YEAHVVP RS ST YKN FGV I QTNSMGVT D E S Y KGDND FW F F PA Y ALRDT 100 
Query: 122 DIFYDQRIAQFRIQEKQPAIKFNFVESLGNAARGGHGSTG 161

Y I ri QFR1 +K PA+ V+ LGN RGGHGSTG 

Sbjct: 101 KIKKGDRICQFRIMKKMPAVDLIEVDRLGNGDRGGHGSTG 140 

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

>emb|CAB07984| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 142

Score = 287 bits (728), Expect = 2e-77

Identities = 142/142 (100%), Positives = 142/142 (100%)

Query: 1 MPMWLNDTAVLTTI ITACSGVLTVLIiNKLFEWKSNKAKSVLEDI STTLSTLKQQVDGIDQ 60
QaerY ' 1 MPIWLNOTAVLTTIITACSGVLTV^ _ 
Sbjct: 1 MHMLOTTAVLTTIITACSG^^ 60 

— 61 ^a= 120 

Sbjct: 61 TTVAINHQNDVIQDGTRKI QRYRLYHDLKREVXTGYTTLDHFRELS I LFES YKNLGGNGE 120 

Query: 121 VEALYEKYKKLPIREEDLDETI 142 

VEALYEKYKKLPIREEDLDETI 
Sbjct: 121 VEALYEKYKKLPIREEDLDETI 142 

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

>emb|CAB07983| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 124

Score = 147 bits (367), Expect = 1e-35

Identities = 75/75 (100%), Positives = 75/75 (100%)

Query: 1 MLNLTKSRQIVAEFTIGQGAEKKLVKTTIVNIDANAV 60
^ ^ KLNLTKSRQIVAEFTIGQGAEKKLVKTTIVNIDANAVSWSETLro 

Sbjct: 1 MUiLTKSRQIVftBm^ 60 

Query: 61 EQKLRETRYAI EDEI 75 

EQKLRETRYAIEDEI 
Sbjct: 61 EQKLRETRYAIEDEI 75 

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters)

>emb|CAB07985| (Z93946) holin [bacteriophage Dp-1]
Length = 74

Score = 63.2 bits (151), Expect = 2e-10
Identities = 34/74 (45%), Positives = 34/74 (45%)

Query: 1 MKLSNEQYDXXXXXXXXXXXXXXXXXXXXXXXYQFD 60 

MKXiSNEOYD YQFD viAivoaR 

Sbjct: 1 MKLSNEQYDVAK1TVVTVWPAAIALITGLGALYQFDTTAITGTI 60 

Query: 61 NYQKEQEAQNNEVE 74 

NYQKEQEAQNNEVE 
Sbjct: 61 NYQKEQEAQNNEVE 74 
Condensed listing of homology information from above

Phage: dp1
Database: nr
Program: Blastp

Query= sid|114822|lan|dp1ORF001 Phage dp1 ORF|36698-40390|2
(1230 letters)

gi|2444124 (U88974) ORF45 [Streptococcus thermophilus temperate ... 467 e-130
gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage B... 427 e-118
gi|2935676 (AF032121) unknown [Streptococcus thermophilus bacter... 309 1e-82
gi|2935691 (AF032122) unknown [Streptococcus thermophilus bacter... 306 7e-82
gi|3540289 (AF057033) putative anti-receptor [Streptococcus ther... 279 6e-74
gi|4530154|gb|AAD21894.1| (AF085222) putative tail-host specific... 220 3e-56
gi|930045|emb|CAA33387| (X15332) alpha-1 (III) collagen [Homo sa... 58 4e-07
gi|1070603|pir||CGHU7L collagen alpha 1(III) chain precursor - h... 58 4e-07
gi|4502951|ref|NP_000081.1|PCOL3A1| collagen, type III, alpha 1 ... 58 4e-07
gi|115290|sp|P04258|CA13_BOVIN COLLAGEN ALPHA 1(III) CHAIN >gi|7... 58 4e-07
gi|575322|emb|CAA36279| (X52046) type III collagen [Mus musculus] 57 8e-07
gi|2119163|pir||S59856 collagen alpha 1(III) chain precursor - ra... 57 8e-07
gi|543912|sp|P13941|CA13_RAT COLLAGEN ALPHA 1(III) CHAIN >gi|543... 57 1e-06
gi|3171998|emb|CAA06510| (AJ005395) collagen alpha 1 (III) [Ratt... 57 1e-06
gi|3947565|emb|CAA90250| (Z49967) similar to collagen; cDNA EST ... 54 7e-06
gi|423403|pir||A46053 bullous pemphigoid antigen, BPAG2, type XV... 53 9e-06
gi|115410|sp|P12114|CCS1_CAEEL CUTICLE COLLAGEN SQT-1 >gi|84437|... 53 9e-06
gi|3873801|emb|CAA90084| (Z49907) cuticle collagen SQT-1; cDNA E... 53 9e-06

Query= sid|114823|lan|dp1ORF002 Phage dp1 ORF|32386-35835|1 
(1149 letters) 

gi|3341922|dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL] 280 3e-74 
gi|4126622|dbj|BAA36642.1| (AB016282) ORF35 [bacteriophage phi-105] 232 1e-59 
gi|1369948|emb|CAA59194| (X84706) host interacting protein [Bact... 201 3e-50 
gi|3139112 (AF063097) gpT [Bacteriophage P2] 188 2e-46 
gi|3337272 (U32222) G protein [Bacteriophage 186] 161 3e-38 
gi|4063799|dbj|BAA36253| (AB008550) orf25; similar to T gene of ... 159 8e-38 
gi|3172274 (AF022214) minor tail subunit; putative tape-measure ... 123 6e-27 
gi|465127|sp|Q05233|VG26_BPML5 MINOR TAIL PROTEIN GP26 >gi|41904... 108 2e-22 
gi|3540284 (AF057033) putative minor tail protein [Streptococcus ... 99 2e-19 
gi|2444119 (U88974) ORF40 [Streptococcus thermophilus temperate ... 90 6e-17 
gi|2634555|emb|CAB14053| (Z99115) yomI [Bacillus subtilis] >gi|3... 66 1e-09 
gi|2392838 (AF011378) unknown [Bacteriophage sk1] 64 5e-09 
gi|2764873|emb|CAA66557| (X97918) gene 18.1 [Bacteriophage SPP1] 62 3e-08 
gi|1353559 (U38906) ORF42 [Bacteriophage r1t] 61 6e-08 
gi|630841|pir||S39079 puff C-8 protein - fungus gnat (Rhynchosci... 55 2e-06 
gi|1730865|sp|P51731|YO27_BPHP1 HYPOTHETICAL 72.8 KD PROTEIN IN ... 53 8e-06 
gi|224285|prf||1101273J ORF 7 [Bacteriophage HP1] 53 1e-05 

Query= sid|114824|lan|dp1ORF003 Phage dp1 ORF|53538-55877|3 
(779 letters) 

gi|118825|sp|P00582|DPO1_ECOLI DNA POLYMERASE I (POL I) >gi|6705... 193 3e-48 
gi|2982102|pdb|1KFS|A Chain A, All-Oxygen Dna Complexed To The 3... 193 3e-48 
gi|229889|pdb|1DPI| DNA Polymerase I (Klenow Fragment) (E.C.2 193 3e-48 
gi|1169402|sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|107... 191 1e-47 
gi|2688462 (AE001156) DNA polymerase I (polA) [Borrelia burgdorf... 190 3e-47 
gi|809180|pdb|1KLN|A Escherichia coli 190 3e-47 
gi|1913934|emb|CAA72997| (Y12328) DNA-directed DNA polymerase I ... 189 8e-47 
gi|4090935 (AF028719) DNA polymerase type I [Rhodothermus sp. 'I... 175 1e-42 
gi|4731571|gb|AAD28505.1|AF121780_1 (AF121780) DNA polymerase I ... 174 2e-42 
gi|1633576 (U57757) similar to proof reading 3'-5' exonuclease an... 173 4e-42 
gi|3322368 (AE001195) DNA polymerase I (polA) [Treponema pallidum] 172 9e-42 
gi|1006595|dbj|BAA10748| (D64005) DNA polymerase I [Synechocysti... 171 2e-41 
gi|585062|sp|Q07700|DPO1_MYCTU DNA POLYMERASE I (POL I) >gi|4161... 163 5e-39 
gi|4376908|gb|AAD18751| (AE001645) DNA Polymerase I [Chlamydia p... 157 2e-37 
gi|1169403|sp|P46835|DPO1_MYCLE DNA POLYMERASE I (POL I) >gi|107... 152 7e-36 
gi|2145839|pir||S72949 DNA polymerase I - Mycobacterium leprae >... 152 7e-36 
gi|1405438|emb|CAA67184| (X98575) DNA-dependent DNA polymerase [... 152 9e-36 
gi|2506365|sp|P80194|DPO1_THECA DNA POLYMERASE I, THERMOSTABLE (... 147 2e-34 
gi|3328929 (AE001322) DNA Polymerase I [Chlamydia trachomatis] 147 3e-34 
gi|3913510|sp|O52225|DPO1_THEFI DNA POLYMERASE I, THERMOSTABLE (... 
gi|1205984 (U33536) DNA polymerase I [Bacillus stearothermophilus] 
gi|118827|sp|P13252|DPO1_STRPN DNA POLYMERASE I (POL I) >gi|9802... 
gi|1942202|pdb|1JXE| Stoffel Fragment Of Taq Dna Polymerase I 
gi|1943520|pdb|1FCTQ| Dna Polymerase 
gi|1084022|pir||JX0359 DNA-directed DNA polymerase (EC 2.7.7.7) ... 
gi|507891|dbj|BAA06775| (D32013) DNA Polymerase [Thermus aquaticus] 
gi|118828|sp|P19821|DPO1_THEAQ DNA POLYMERASE I, THERMOSTABLE (T... 
gi|1706502|sp|P52028|DPO1_THETH DNA POLYMERASE I, THERMOSTABLE (... 
gi|1097211|prf||2113329A DNA polymerase [Thermus aquaticus therm... 
gi|2098289|pdb|1TAU|A Chain A, Structure Of Dna Polymerase 

Query= sid|114825|lan|dp1ORF004 Phage dp1 ORF|40401-42440|3 
(679 letters) 

gi|1934761|emb|CAB07981| (Z93946) hypothetical protein [bacterio... 1011 0.0 
gi|3540290 (AF057033) putative minor structural protein [Strepto... 346 2e-94 
gi|2444125 (U88974) ORF46 [Streptococcus thermophilus temperate ... 339 3e-92 
gi|1934762|emb|CAB07982| (Z93946) hypothetical protein [bacterio... 300 2e-80 
gi|4530155|gb|AAD21895.1| (AF085222) unknown [Streptococcus ther... 276 4e-73 
gi|2935677 (AF032121) unknown [Streptococcus thermophilus bacter... 250 3e-65 
gi|2935692 (AF032122) unknown [Streptococcus thermophilus bacter... 250 3e-65 
gi|1136289 (U42597) histidine kinase A [Dictyostelium discoideum] 50 7e-05 

Query= sid|114827|lan|dp1ORF006 Phage dp1 ORF|45296-46987|2 
(563 letters) 

gi|4377165|gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Ch... 171 1e-41 
gi|1769947|emb|CAA67095| (X98455) SNF [Bacillus cereus] 160 3e-38 
gi|3329163 (AE001341) SWI/SNF family helicase [Chlamydia trachom... 159 6e-38 
gi|4377149|gb|AAD18973| (AE001664) SWI/SNF family helicase_1 [Ch... 157 2e-37 
gi|3328995 (AE001326) SWI/SNF family helicase [Chlamydia trachom... 153 2e-36 
gi|2493354|sp|P75093|Y018_MYCPN HYPOTHETICAL HELICASE MG018/MG01... 146 4e-34 
gi|1653748|dbj|BAA18659| (D90916) helicase of the snf2/rad54 fam... 143 3e-33 
gi|1763712|emb|CAB05939| (Z83337) member of the SNF 2 helicase fa... 143 4e-33 
gi|2636153|emb|CAB15645.1| (Z99122) similar to SNF2 helicase [Ba... 143 4e-33 
gi|2909552|emb|CAA17284| (AL021924) helZ [Mycobacterium tubercul... 140 2e-32 
gi|3844627 (U39681) ATP-dependent RNA helicase, putative [Mycopl... 136 3e-31 
gi|1351463|sp|P47264|Y018_MYCGE HYPOTHETICAL HELICASE MG018 136 4e-31 
gi|2660669 (AC002342) human Mi-2 autoantigen-like protein [Arabi... 131 2e-29 
gi|1361537|pir||I64201 helicase (mot1) homolog - Mycoplasma geni... 129 4e-29 
gi|3482977|emb|CAA20533.1| (AL031369) putative protein [Arabidop... 128 9e-29 
gi|3298562 (U91543) zinc-finger helicase [Homo sapiens] 120 2e-26 
gi|3875971|emb|CAB02491| (Z80344) similar to helicase; cDNA EST ... 120 2e-26 
gi|4557451|ref|NP_001263.1|PCHD3| chromodomain helicase DNA bind... 120 2e-26 
gi|2645435 (AF007780) CHD3 [Drosophila melanogaster] 118 1e-25 
gi|3875165|emb|CAA91798| (Z67881) Similarity to Mouse Chromodoma... 118 1e-25 

Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3 
(463 letters) 

gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate ... 89 7e-17 
gi|3318666 (U19754) BBA31 homolog [Borrelia burgdorferi] 59 7e-08 
gi|2690260 (AE000790) conserved hypothetical protein [Borrelia b... 56 5e-07 
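
Each hit line in the condensed listing carries a gi identifier, a (usually truncated) description, a bit score and an E-value. As a minimal sketch, assuming the score and E-value are always the last two whitespace-separated fields, such a line could be split in Python as follows. The `Hit` type and the `parse_hit` helper are hypothetical names introduced here for illustration; they are not part of the BLAST software.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    gi: str           # numeric gi identifier, kept as text
    text: str         # identifier fields plus the truncated description
    bit_score: float
    e_value: float


# Trailing "<bit score> <E-value>" pair. BLAST prints E-values in forms
# such as "7e-17", "e-130" (mantissa omitted) or "0.0".
_TAIL = re.compile(r"\s+(\d+(?:\.\d+)?)\s+(\d*\.?\d*e-\d+|\d+\.\d+)\s*$")


def parse_hit(line: str) -> Optional[Hit]:
    """Split one condensed hit line; return None for any other line."""
    match = _TAIL.search(line)
    if not line.startswith("gi|") or match is None:
        return None
    # Drop the trailing "..." that marks a truncated description.
    text = line[:match.start()].rstrip(" .")
    gi = text.split("|")[1].split()[0]
    e_text = match.group(2)
    if e_text.startswith("e"):      # "e-130" means 1e-130
        e_text = "1" + e_text
    return Hit(gi, text, float(match.group(1)), float(e_text))


if __name__ == "__main__":
    line = ("gi|2444105 (U88974) ORF26 [Streptococcus thermophilus "
            "temperate ... 89 7e-17")
    print(parse_hit(line))
```

Lines without a trailing score and E-value (for example the "Query=" and "(463 letters)" lines) return `None`, so a caller can skip them without special handling.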

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1 
(445 letters) 

gi|4406210|gb|AAD19901| (AF100420) DnaB replication fork helicas... 68 2e-10 
gi|3121983|sp|O25916|DNAB_HELPY REPLICATIVE DNA HELICASE >gi|231... 67 2e-10 
gi|4416322|gb|AAD20314| (AF106032) replicative helicase; DnaB [B... 65 9e-10 
gi|4155895 (AE001551) REPLICATIVE DNA HELICASE [Helicobacter pyl... 60 4e-08 
gi|3322317 (AE001191) replicative DNA helicase (dnaB) [Treponema... 58 1e-07 
gi|138031|sp|P04530|VG41_BPT4 PRIMASE-HELICASE (PROTEIN GP41) >g... 53 3e-06 
gi|2983861 (AE000742) replicative DNA helicase [Aquifex aeolicus] 51 1e-05 

Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2 
(386 letters) 

gi|2760912 (AF037258) RecA protein [Chlorobium tepidum] 133 2e-30 
gi|3219851|sp|P94666|RECA_CLOPE RECA PROTEIN >gi|1698591 (U61497... 129 3e-29 
gi|1350566|sp|P48295|RECA_STRVL RECA PROTEIN >gi|508860 (U04837)... 128 7e-29 
gi|744163|prf||2014250A recA-like protein [Streptomyces violaceus] 126 3e-28 
gi|730487|sp|P41054|RECA_STRAM RECA PROTEIN >gi|511133|emb|CAA82... 125 4e-28 
gi|2687334|emb|CAA15875| (AL020958) RecA protein [Streptomyces c... 125 6e-28 
gi|1350565|sp|P48294|RECA_STRLI RECA PROTEIN >gi|481482|pir||S38... 125 6e-28 
gi|464599|sp|P33542|RECA_AQUPY RECA PROTEIN >gi|1086167|pir||A55... 
gi|417636|sp|P32725|RECA_RHOSH RECA PROTEIN >gi|541307|pir||S415... 
gi|2984348 (AE000775) recombination protein RecA [Aquifex aeolicus 
gi|3219854|sp|P95846|RECA_STRRM RECA PROTEIN >gi|1729800|emb|CAA... 
gi|2500086|sp|Q59560|RECA_MYCSM RECA PROTEIN >gi|1430892|emb|CAA... 
gi|1350567|sp|P48296|RECA_THEAQ RECA PROTEIN >gi|1072963|pir||A5... 
gi|625663|pir||JX0292 recA protein - Thermus aquaticus (strain HB8 
gi|1172880|sp|P42440|RECA_CAMJE RECA PROTEIN >gi|2119991|pir||I4... 
gi|4154654 (AE001453) RECA PROTEIN [Helicobacter pylori J99] 
gi|1072968|pir||C55020 recA protein - Thermus sp >gi|458472|dbj|... 
gi|3219852|sp|P95469|RECA_PARDE RECA PROTEIN >gi|1825468 (U59631... 
gi|2507284|sp|P42445|RECA_HELPY RECA PROTEIN >gi|2313235|gb|AAD0... 
gi|1172890|sp|Q02350|RECA_STAAU RECA PROTEIN >gi|463285 (L25893)... 
gi|4416209|gb|AAD20261| (AF094756) RecA protein [Bifidobacterium... 
gi|2500084|sp|Q59180|RECA_BORBU RECA PROTEIN >gi|1276443 (U23457... 

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3 
(359 letters) 

gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate ... 
gi|3320438 (AF057033) gp348 [Streptococcus thermophilus bacterio... 
gi|479514|pir||S34244 hypothetical protein p38 - actinophage VWB... 

Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3 
(341 letters) 

gi|580855|emb|CAA29958| (X06803) dnaZX-like ORF put. DNA polymer... 
gi|118807|sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA... 
gi|98292|pir||S13786 DNA-directed DNA polymerase (EC 2.7.7.7) II... 
gi|1527142 (U66040) DNA polymerase III gamma subunit [Salmonella... 
gi|2494197|sp|P74876|DP3X_SALTY DNA POLYMERASE III SUBUNITS GAMM... 
gi|118808|sp|P06710|DP3X_ECOLI DNA POLYMERASE III SUBUNITS GAMMA... 
gi|4155207 (AE001497) DNA POLYMERASE III SUBUNITS GAMMA AND TAU ... 
gi|2313841|gb|AAD07767.1| (AE000584) DNA polymerase III gamma an... 
gi|2583049 (AF025391) DNA polymerase III holoenzyme tau subunit ... 
gi|2984127 (AE000759) DNA polymerase III gamma subunit [Aquifex ... 
gi|3861390|emb|CAA15289| (AJ235273) DNA POLYMERASE III SUBUNITS ... 
gi|1169397|sp|P43746|DP3X_HAEIN DNA POLYMERASE III SUBUNITS GAMM... 
gi|1293572 (U49738) DNA polymerase III tau homolog DnaX [Cauloba... 
gi|3328753 (AE001306) DNA Pol III Gamma and Tau [Chlamydia trach... 
gi|4376294|gb|AAD18193| (AE001589) DNA Polymerase III Gamma and ... 
gi|581255|emb|CAA28175| (X04487) alternate dnaZX protein (AA 1-6... 
gi|2688379 (AE001151) DNA polymerase III, subunits gamma and tau... 
gi|3323329 (AE001268) DNA polymerase III, subunits gamma and tau... 

Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3 
(337 letters) 

gi|1346796|sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64... 
gi|740008|prf||2004290A primase [Haemophilus influenzae] 
gi|1172619|sp|Q08346|PRIM_HAEIN DNA PRIMASE >gi|1074033|pir||A64... 
gi|1709769|sp|Q04505|PRIM_LACLA DNA PRIMASE >gi|1075726|pir||JC2... 
gi|639846|dbj|BAA03516| (D14690) DNA primase [Lactococcus lactis] 

Query= sid|114837|lan|dplORFOlS Phage dp1 ORF|43413-44303|3 
(296 letters) 



123 
123 
123 
122 
122 
122 
121 
120 
120 
120 
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119 
118 
116 
118 



187 
179 
62 



182 
182 
182 
172 
172 
170 
169 
168 
166 
166 
165 
156 
151 
148 
148 
146 
140 
137 



57 
51 
51 
51 
51 



2e-27 
2e-27 
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4e-27 
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2e-26 
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2e-44 
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2e-45 
2e-45 
2e-45 
4e-42 
4e-42 
1e-41 
2e-41 
4e-41 
3e-40 
3e-40 
5e-40 
2e-37 
8e-36 
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3e-34 
2e-32 
1e-31 



2e-07 
1e-05 
1e-05 
1e-05 
1e-05 



gi|1934766|emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine ami... 
gi|113676|sp|P06653|ALYS_STRPN AUTOLYSIN (N-ACETYLMURAMOYL-L-ALA... 
gi|282326|pir||A42935 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 
gi|416618|sp|P32762|ALYS_BPHB3 LYTIC AMIDASE (N-ACETYLMURAMOYL-L... 
gi|285273|pir||A42936 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE)... 
gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5... 
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE)... 
gi|928832 (L44593) ORF259; putative [Lactococcus lactis phage BK... 
gi|2511705|emb|CAA71783| (Y10818) sigA binding protein [Streptoc... 
gi|4097980 (U72655) surface protein C [Streptococcus pneumoniae] 
gi|2351768 (U89711) PspA [Streptococcus pneumoniae] 
gi|2425109 (AF019904) choline binding protein A [Streptococcus p... 
gi|282335|pir||A41971 surface protein pspA precursor - Streptoco... 
gi|2576331|emb|CAA05158| (AJ002054) SpsA protein [Streptococcus ... 
gi|2127295|pir||S57962 cspC protein - Clostridium acetobutylicum... 
gi|2576333|emb|CAA05159| (AJ002055) SpsA protein [Streptococcus ... 
gi|4106522|gb|AAD02874.1| (AF097909) excreted protein FibB [Pept... 
gi|1361406|pir||S57714 cspB protein - Clostridium acetobutylicum... 
gi|1914872|emb|CAB04758| (Z82001) PCPA [Streptococcus pneumoniae] 
gi|3168594|dbj|BAA28613| (AB012763) SpaA [Erysipelothrix rhusiop... 81 1e-14 
gi|2292750|emb|CAA64942| (X95646) homology to orf259 of lactococ... 80 3e-14 
gi|2935696 (AF032122) putative lysin [Streptococcus thermophilus... 80 3e-14 
gi|4566910|dbj|BAA76540.1| (AB017447) protective antigen SpaA.1 ... 80 3e-14 
gi|3540294 (AF057033) lysin [Streptococcus thermophilus bacterio... 79 5e-14 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1 
(264 letters) 

gi|2633745|emb|CAB13247| (Z99111) similar to coenzyme PQQ synthe... 217 5e-56 
gi|2808502|emb|CAA12532| (AJ225561) ExsD protein [Sinorhizobium ... 163 1e-39 
gi|3861151|emb|CAA15051| (AJ235272) unknown [Rickettsia prowazekii] 82 6e-15 
gi|1652793|dbj|BAA17712| (D90908) hypothetical protein [Synechoc... 76 3e-13 
gi|1723815|sp|P55139|YGCF_ECOLI HYPOTHETICAL 25.0 KD PROTEIN IN ... 70 2e-11 
gi|2984272 (AE000769) hypothetical protein [Aquifex aeolicus] 66 4e-10 
gi|4155435 (AE001516) putative [Helicobacter pylori J99] 57 1e-07 
gi|2127833|pir||C64505 coenzyme PQQ synthesis protein III homolo... 55 5e-07 
gi|2622333 (AE000890) coenzyme PQQ synthesis protein III [Methan... 54 9e-07 
gi|3257042|dbj|BAA29725| (AP000003) 254aa long hypothetical prot... 53 2e-06 
gi|2314068|gb|AAD07976.1| (AE000602) conserved hypothetical prot... 52 6e-06 
gi|1723816|sp|P45097|YGCF_HAEIN HYPOTHETICAL PROTEIN HI1189 >gi|... 50 2e-05 

Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2 
(263 letters) 

gi|127481|sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >... 208 4e-53 
gi|3242315|emb|CAA04237| (AJ000685) GTP cyclohydrolase [Streptoc... 191 4e-48 
gi|2494695|sp|Q54769|GCH1_SYNP7 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 189 2e-47 
gi|255061|bbs|112832 (S44049) GTP cyclohydrolase I {clone hGCH-1... 187 7e-47 
gi|4503949|ref|NP_000152.1|PGCH1| GTP cyclohydrolase 1 (dopa-res... 187 7e-47 
gi|2113967|emb|CAB08935| (Z95557) folE [Mycobacterium tuberculosis] 187 7e-47 
gi|1730240|sp|P50141|GCH1_CHICK GTP CYCLOHYDROLASE I (GTP-CH-I) ... 185 3e-46 
gi|2494696|sp|Q55759|GCH1_SYNY3 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 5e-46 
gi|121061|sp|P22288|GCH1_RAT GTP CYCLOHYDROLASE I PRECURSOR (GTP... 184 6e-46 
gi|3183014|sp|O13774|GCH1_SCHPO GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 6e-46 
gi|3097224|emb|CAA18795| (AL023093) GTP cyclohydrolase I [Mycoba... 182 2e-45 
gi|2494697|sp|Q19980|GCH1_CAEEL PROBABLE GTP CYCLOHYDROLASE I (G... 182 2e-45 
gi|462167|sp|Q05915|GCH1_MOUSE GTP CYCLOHYDROLASE I PRECURSOR (G... 180 7e-45 
gi|1669664|emb|CAA89808| (Z49706) GTP cyclohydrolase I [Dictyost... 180 1e-44 
gi|2981082 (AF052048) GTP-cyclohydrolase [Ostertagia ostertagi] 178 3e-44 
gi|31954|emb|CAA78908| (Z16418) GTP cyclohydrolase I [Homo sapi... 177 8e-44 
gi|551344|bbs|150280 (S71373) GTP cyclohydrolase I [mice, Peptid... 174 5e-43 
gi|1730247|sp|P51601|GCH1_YEAST GTP CYCLOHYDROLASE I (GTP-CH-I) ... 174 7e-43 
gi|1246912|emb|CAA87397| (Z47201) GTP cyclohydrolase 1 [Saccharo... 172 2e-42 
gi|1730246|sp|P51595|GCH1_STRPN GTP CYCLOHYDROLASE I (GTP-CH-I) ... 168 3e-41 
gi|2982951 (AE000680) GTP-cyclohydrolase I [Aquifex aeolicus] 164 6e-40 

Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2 
(259 letters) 

gi|2347102 (U77367) internalin [Listeria monocytogenes] 
gi|3123226|sp|P25146|INLA_LISMO INTERNALIN A PRECURSOR >gi|48705... 
gi|149674 (M67471) internalin [Listeria monocytogenes] 

Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2 
(228 letters) 

gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB... 
gi|3861231|emb|CAA15131| (AJ235272) unknown [Rickettsia prowazekii] 
gi|2622210 (AE000881) conserved protein [Methanobacterium thermo... 
gi|2983380 (AE000709) trans-regulatory protein ExsB [Aquifex aeo... 
gi|1001327|dbj|BAA10814| (D64006) ExsB [Synechocystis sp.] 
gi|2128055|pir||B64468 hypothetical protein homolog MJ1347 - Met... 
gi|4155143 (AE001491) putative [Helicobacter pylori J99] 
gi|2313760|gb|AAD07701.1| (AE000578) conserved hypothetical prot... 
gi|2120814|pir||S60183 protein ExsB - Rhizobium meliloti >gi|114... 
gi|2633743|emb|CAB13245| (Z99111) similar to hypothetical protei... 
gi|1175543|sp|P44124|YBAX_HAEIN HYPOTHETICAL PROTEIN HI1191 >gi|... 
gi|2495537|sp|P77756|YBAX_ECOLI HYPOTHETICAL 25.5 KD PROTEIN IN ... 
gi|3256471|dbj|BAA29154.1| (AP000001) 269aa long hypothetical pr... 
gi|2921156 (AF022216) aluminum resistance protein [Arthrobacter ... 

Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2 
(173 letters) 

gi|2633746|emb|CAB13248| (Z99111) similar to hypothetical protei... 
gi|4155926 (AE001554) putative [Helicobacter pylori J99] 162 1e-39 
gi|2314588|gb|AAD08456.1| (AE000642) conserved hypothetical prot... 161 3e-39 
gi|2983458 (AE000714) hypothetical protein [Aquifex aeolicus] 103 9e-22 
gi|1006604|dbj|BAA10757| (D64005) hypothetical protein [Synechoc... 87 6e-17 
gi|2967529 (U11045) unknown [Buchnera aphidicola] 79 2e-14 
gi|2495654|sp|Q46920|YQCD_ECOLI HYPOTHETICAL 32.6 KD PROTEIN IN ... 69 2e-11 
gi|1175604|sp|P44153|YQCD_HAEIN HYPOTHETICAL PROTEIN HI1291 >gi|... 63 1e-09 
gi|3860642|emb|CAA14543| (AJ235270) unknown [Rickettsia prowazekii] 56 1e-07 

Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1 
(184 letters) 

gi|1353529 (U38906) ORF12 [Bacteriophage r1t] 53 1e-06 

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3 
(173 letters) 

gi|1175542|sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|... 100 6e-21 
gi|2982977 (AE000681) hypothetical protein [Aquifex aeolicus] 67 7e-11 
gi|3860744|emb|CAA14645| (AJ235270) unknown [Rickettsia prowazekii] 65 3e-10 
gi|2650193 (AE001074) conserved hypothetical protein [Archaeoglo... 58 4e-08 
gi|3258383|dbj|BAA31066.1| (AP000007) 157aa long hypothetical pr... 55 2e-07 
gi|1001713|dbj|BAA10550| (D64004) hypothetical protein [Synechoc... 50 8e-06 
gi|4155434 (AE001516) putative [Helicobacter pylori J99] 50 1e-05 

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3 
(165 letters) 

gi|1922884|emb|CAA68244| (X99978) ORF7; hydophobic protein [Lact... 64 5e-10 

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3 
(163 letters) 

gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] >gi|26... 108 2e-23 
gi|2634150|emb|CAB13650| (Z99113) similar to deoxyuridine 5'-tri... 108 3e-23 
gi|3913546|sp|O54134|DUT_STRCO DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 56 2e-07 
gi|3913542|sp|O48500|DUT_BPT5 DEOXYURIDINE 5'-TRIPHOSPHATE NUCLE... 52 3e-06 
gi|3913548|sp|O68992|DUT_CHLTE DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 50 1e-05 

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3 
(142 letters) 

gi|1934764|emb|CAB07984| (Z93946) hypothetical protein [bacterio... 287 2e-77 

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1 
(89 letters) 

gi|1934763|emb|CAB07983| (Z93946) hypothetical protein [bacterio... 147 1e-35 

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1 
(74 letters) 

gi|1934765|emb|CAB07985| (Z93946) holin [bacteriophage Dp-1] 63 2e-10 

Table 32 

Sequence of Dp1 published by Sheehan et al., 4731 nucleotides. 

1 ttcaaatttt ttgacaaagt taattcaaat tgtaccgctg aagcaatttt ccatgtattc actcaaagtt 

71 gttcagtgtg gctcaatcat atcaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga 

141 agaccttaaa tatcgaattg actcaaaagc cgatcaaaag ctaactaacc aacagttgac ggcactcacg 

211 gaaaaggctc aactacatga cgcagaactg aaagccaagg ctacaatgga gcagttaagt aacctagaaa 

281 aggcttatga aggtagaatg aaagccaatg aagaagctat caacaaatcg gaacccgacc taatcttagc 

351 ggcaagtcga attgaagcta ctatccaaga acttggcggg ctacgggaac tgaagaagtt cgtcgacagt 

421 tgcatgagct cttctaatca aggtctaatt atcggtaaga acgacggtag ctctaccatt aaggtatcaa 

491 gtgaccgaat ttctatgttc tccgcaggga atgaagttat gtaccttacg caagggtcca ttcacatcga 

561 caacgggatc tttacccaat ccattcaagt cggccgattt agaacggaac aatactcgtt taatccagac 

631 atgaacgtga ttcggtatgt aggataagga gaataacatg acaaaattta tcaactcata cggccctctt 

701 cacttgaacc tttacgtcga acaagttagt caggacgtaa cgaacaactc cccgcgagtt agttggcgag 

771 ctactgtcga ccgcgatgga gcttatcgaa cgtggactta tggaaatatt agtaaccttt ccgtatggtt 

841 aaatggttca agtgttcata gcagtcaccc agactacgac acgtccggcg aagaggtaac gctcgcaagt 

911 ggagaagtga ctgttcctca caatagtgac gggacaaaga caatgtccgt ttgggcttcg tttgacccta 

981 ataacggcgt tcacggaaat atcactatct ctactaatta cactttagac agtattccaa ggtctacaca 

1051 gatttctagt tttgagggaa atcgaaatct aggatcttta catacggtta tctttaaccg aaaagtgaac 

1121 tcttttacgc atcaagtttg gtaccgagtt ttcggtagcg actggataga tttaggtaag aaccatacta 

1191 ctagcgtatc ctttacgccg tcactggact tagcaaggta cttacctaaa tcaagttccg gaacaatgga 

1261 catctgtatt cgaacctaca acggaactac gcaaattggt agtgacgtct actcaaacgg atggaggttc 

1331 aacatccccg attcagtacg tcctactttt tcgggcattt ctttagtaga cacgacttca gcggttcgac 

1401 agattttaac agggaacaac ttcctccaaa tcatgtcgaa cattcaagtc aacttcaaca atgcttccgg 

1471 cgcttacgga tccactatcc aagcatttca cgctgagctc gtaggtaaaa accaagctat caacgaaaac 

1541 ggcggcaaat tgggtatgat gaactttaat ggctccgcta ccgtaagagc atgggttaca gacacgcgag 

1611 gaaaacaatc gaacgtccaa gacgtatcta tcaatgttat agaatactat ggaccgtcta tcaatttctc 

1681 cgttcaacgt actcgtcaaa atcctgcaat tatccaagct cttcgaaatg ctaaggtcgc acctataacg 

1751 gtaggaggtc aacagaaaaa catcatgcaa attaccttct ccgtggcgcc gttgaacact actaatttca 

1821 cagaagatag aggttcggcg tcagggacgt tcactactat ttccctactg actaactcgt ccgcgaactt 

1891 agctggtaac tacgggccgg acaagtctta catagttaag gctaaaatcc aagacaggtt cacttcgact 

1961 gaatttagtg ctacggtacc taccgaatca gtagttctta actatgacaa ggacggtcga cttggagttg 

2031 gtaaggttgt agaacaaggg aaggcagggt caattgatgc agcaggtgat atacatgctg gaggccgaca 

2101 agttcaacag tttcagctca ccgataataa tggagcattg aacaggggtc aatataacga tgttggaata 

2171 agcgtgaaac agagtttaca tggcgaagta acaaatacga ggacaaccct acgggaactc gaggtgaatg 

2241 gggactattt caaaatttct ggttagatag ctggaaaatg gttcaatcct tcattacaat gtcaggaaga 

2311 atgttcatca ggacagcgaa cgatggaaac agctggagac ctaacaagtg gaaagaggtt ctatttaagc 

2381 aagacttcga acagaataat tggcagaaac ttgttcttca aagtgggtgg aaccatcact caacctatgg 

2451 cgacgcattc tattcgaaaa ctcctgacgg catagtatat ttgagaggaa atgtgcataa aggacttatc 

2521 gacaaagagg ctactattgc agtacttcct gaaggattta gaccgaaagt ttcaatgtat cttcaggctc 

2591 tcaataactc atatggaaat gccattctat gtatacacac tgacggaaga cttgtggtga aatcgaatgt 

2661 agataattct tggttaaatt tagacaatgt ctcatttcgt atttaatttg agctgaaatc atgttataat 

2731 attttttaga aaggaggtga gaactatgtt gaaccttaca aaatcgcgcc aaattgtggc agagttcact 

2801 attggacaag gagctgaaaa gaaacttgtc aaaacaacga ttgtgaacat tgatgcaaac gcagtatcaa 

2871 ccgtctctga aactcttcat gacccagact tgtatgctgc gaaccgtcga gaacttcgag ctgacgagca 

2941 aaaacttcgc gaaactcgtt acgcaatcga agatgaaatt aatagctgga gcgggggaaa aaagggggag 

3011 cccggctcta acaggctgaa taaggaggcg tcaatctatg ccaatgtggc taaacgacac cgcagtcttg 

3081 acgacgatta ttacagcgtg cagcggagtg cttactgtcc tactaaataa gctattcgaa tggaaatcga 

3151 ataaagccaa gagcgtttta gaggatatct ctacaactct tagcactctt aaacagcagg tcgacgggat 

3221 tgaccaaacg acagtagcaa tcaatcacca aaatgacgtc attcaagacg gaaccagaaa aattcaacgt 

3291 taccgtcttt atcacgactt aaaaagggaa gtgataacag gctatacaac tctcgaccat tttagagagc 

3361 tctctatttt attcgaaagt tacaagaacc ttggcggaaa tggtgaagtt gaagccttgt atgaaaaata 

3431 caagaaatta ccaattaggg aggaagattt agatgaaact atctaacgaa caatatgacg tagcaaagaa 

3501 cgtggtaacc gtagtcgttc cagcagcgat tgcactaatt acaggtcttg gagcgttgta tcaatttgac 

3571 actactgcta tcacaggaac cactgcactt cttgcaactt ttgcaggtac tgttctagga gtttctagcc 

3641 gaaactacca aaaggaacaa gaagctcaaa acaatgaggt ggaataatgg gagtcgatat tgaaaaaggc 

3711 gttgcgtgga tgcaggcccg aaagggtcga gtatcttata gcatggactt tcgagacggt cctgatagct 

3781 atgactgctc aagttctatg tactatgctc tccgctcagc cggagcttca agtgctggat gggcagtcaa 

3851 tactgagtac atgcacgcac ggcttattga aaacggttat gaactaatta gtgaaaatgc tccgtgggat 

3921 gctaaacgag gcgacatctt cacccgggga cgcaaaggtg ctagcgcagg cgctggaggt catacaggga 

3991 tgttcattga cagtgataac atcattcact gcaactacgc ctacgacgga atttccgtca acgaccacga 

4061 tgagcgttgg tactatgcag gtcaacctta ctactacgtc tatcgcttga ctaacgcaaa cgctcaaccg 

4131 gctgagaaga aactcggctg gcagaaagat gctactggtt tctggtacgc tcgagcaaac ggaacttatc 

4201 caaaagatga gttcgagtat atcgaagaaa acaagtcttg gttctacttt gacgaccaag gctacatgct 

4271 cgctgagaaa tggttgaaac ataccgatgg aaattggtat tggttcgacc gtgacggata catggctacg 

4341 tcatggaaac ggattggcga gtcatggtac tacttcaatc gcgatggttc aatggtaacc ggttggatta 

4411 agtattacga taattggtat tatcgtgatg ctaccaacgg cgacatgaaa tcgaatgcgt ttatccgtta 

4481 taacgacggc tggtatctac tattaccgga cggacgtctg gcagataaac cccaattcac cgtagagccg 

4551 gacgggctca ttactgctaa agtttaaaat atagagagga ggaagctctt ttcttaatat tgtttctctt 

4621 aatcccgcaa ggtttcgacc ccgcggggtt tatgtgtcgt gaattactct atttacttat tcgaagattt 

4691 caattataat taaataatca acgagattca taattggagg aatg 
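
As an illustration of how a protein sequence follows from the nucleotide listing, the sketch below translates one open reading frame of Table 32 in Python. Several points are assumptions made here rather than statements from the source: the coordinates 3463-3687 were chosen by locating an ATG followed by an uninterrupted reading frame, the bases are copied from the table as printed, the standard genetic code is used, and `translate` is a hypothetical helper name. Under those assumptions the product is 74 residues long and ends in NYQKEQEAQNNEVE, which agrees with the C-terminal residues shown in the 74-residue holin alignment.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 3463-3687 of the Table 32 listing, copied as printed
# (the coordinates are an assumption of this sketch).
ORF_DNA = (
    "atgaaact" "atctaacgaa" "caatatgacg" "tagcaaagaa"
    "cgtggtaacc" "gtagtcgttc" "cagcagcgat" "tgcactaatt"
    "acaggtcttg" "gagcgttgta" "tcaatttgac"
    "actactgcta" "tcacaggaac" "cactgcactt" "cttgcaactt"
    "ttgcaggtac" "tgttctagga" "gtttctagcc"
    "gaaactacca" "aaaggaacaa" "gaagctcaaa" "acaatgaggt" "ggaataa"
)


def translate(dna: str) -> str:
    """Translate frame 1 of `dna`, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3].upper()]
        if amino == "*":
            break
        residues.append(amino)
    return "".join(residues)


if __name__ == "__main__":
    protein = translate(ORF_DNA)
    print(len(protein), protein)
```

Because the table is written out as a plain dictionary, swapping in a different translation table only requires changing the `AMINO` string.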
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Table 33 



Streptococcus accession numbers 
gi|5776553|gb|AF026471.2|AF026471 [5776553] 
gi|5410470|gb|AF139890.1|AF139890 [5410470] 
gi|5410468|gb|AF139889.1|AF139889 [5410468] 
gi|5410466|gb|AF139888.1|AF139888 [5410466] 
gi|5410464|gb|AF139887.1|AF139887 [5410464] 
gi|5410462|gb|AF139886.1|AF139886 [5410462] 
gi|5410460|gb|AF139885.1|AF139885 [5410460] 
gi|5410458|gb|AF139884.1|AF139884 [5410458] 
gi|5410456|gb|AF139883.1|AF139883 [5410456] 
gi|3093394|emb|AJ005697.1|SPN5697 [3093394] 
gi|5759208|gb|AF171873.1|AF171873 [5759208] 
gi|5758311|gb|AF162664.1|AF162664 [5758311] 
gi|5739313|gb|AF161701.1|AF161701 [5739313] 
gi|5739310|gb|AF161700.1|AF161700 [5739310] 
gi|5726354|gb|AF159448.1|AF159448 [5726354] 
gi|5726290|gb|AF127143.1|AF127143 [5726290] 
gi|5712666|gb|AF140784.1|AF140784 [5712666] 
gi|4218525|emb|AJ009639.1|SPAJ9639 [4218525] 
gi|5616524|gb|AF169483.1|AF169483 [5616524] 
gi|5579395|gb|AF162656.1|AF162656 [5579395] 
gi|5579393|gb|AF162655.1|AF162655 [5579393] 
gi|5578890|emb|AJ131985.1|SPN131985 [5578890] 
gi|5566442|gb|AF167442.1|AF167442 [5566442] 
gi|5459332|emb|AJ243540.1|EVE243540 [5459332] 
gi|5305398|gb|AF072811.1|AF072811 [5305398] 
gi|5295921|emb|AJ242698.1|SPN242698 [5295921] 
gi|5295920|emb|AJ242697.1|SPN242697 [5295920] 
gi|5295919|emb|AJ242696.1|SPN242696 [5295919] 
gi|5295918|emb|AJ242695.1|SPN242695 [5295918] 
gi|4583522|gb|AF140356.1|AF140356 [4583522] 
gi|5231206|gb|AF157826.1|AF157826 [5231206] 
gi|5231203|gb|AF157825.1|AF157825 [5231203] 



gi|5231200|gb|AF157824.1|AF157824 [5231200] 
gi|5231197|gb|AF157823.1|AF157823 [5231197] 
gi|5231194|gb|AF157822.1|AF157822 [5231194] 
gi|5231191|gb|AF157821.1|AF157821 [5231191] 
gi|5231188|gb|AF157820.1|AF157820 [5231188] 
gi|5231185|gb|AF157819.1|AF157819 [5231185] 
gi|5231182|gb|AF157818.1|AF157818 [5231182] 
gi|5231179|gb|AF157817.1|AF157817 [5231179] 
gi|4336851|gb|AF106138.1|AF106138 [4336851] 
gi|4336848|gb|AF106137.1|AF106137 [4336848] 
gi|4336845|gb|AF106136.1|AF106136 [4336845] 
gi|4336842|gb|AF106135.1|AF106135 [4336842] 
gi|4336839|gb|AF106134.1|AF106134 [4336839] 
gi|4336836|gb|AF106133.1|AF106133 [4336836] 
gi|4336833|gb|AF106132.1|AF106132 [4336833] 
gi|3907597|gb|AF094575.1|AF094575 [3907597] 
gi|5030425|gb|AF061748.2|AF061748 [5030425] 
gi|4902881|emb|AJ239004.1|SPN239004 [4902881] 
gi|5001710|gb|AF112358.1|AF112358 [5001710] 
gi|5001690|gb|AF106539.1|AF106539 [5001690] 
gi|4973271|gb|AF144420.1|AF144420 [4973271] 
gi|4973269|gb|AF144419.1|AF144419 [4973269] 
gi|4973267|gb|AF144418.1|AF144418 [4973267] 
gi|4928190|gb|AF129757.1|AF129757 [4928190] 
gi|4927743|gb|AF126061.1|AF126061 [4927743] 
gi|4927742|gb|AF126060.1|AF126060 [4927742] 
gi|4927741|gb|AF126059.1|AF126059 [4927741] 
gi|4495247|emb|AJ240675.1|SPN240675 [4495247] 
gi|4495245|emb|AJ240670.1|SPN240670 [4495245] 
gi|4495243|emb|AJ240669.1|SPN240669 [4495243] 
gi|4495241|emb|AJ240668.1|SPN240668 [4495241] 

gi|4495239|emb|A J240667. 1 |SPN240667 
[4495239] 
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gi|4495237|emb|AJ240666.1|SPN240666 
[4495237] 

gi|4495235|emb|AJ240665.1|SPN240665 
[4495235] 

gi|4495233|emb|AJ240664. 1 |SPN240664 
[4495233] 

gi|4495231|emb|AJ240663.1|SPN240663 
[4495231] 

gi|4495229|emb|AJ240662.1|SPN240662 
[4495229] 

gi|4495227|emb|AJ24066 1 . 1 |SPN24066l 
[4495227] 

gi|4495225|erab|A J240660. 1 |SPN240660 
[4495225] 

gi|4495223|emb|AJ240659. 1 |SPN240659 
[4495223] 

gi|449522 1 |emb|AJ240658. 1 |SPN240658 
[4495221] 

gi|44952 1 9|emb|AJ240657. 1 |SPN240657 
[4495219] 

gi|44952 1 7|emb|AJ240656. 1 [SPN240656 
[4495217] 

gi|44952 15|emb|AJ240655. 1 |SPN240655 
[4495215] 

gi|4495213|emb|AJ240654.1|SPN240654 
[4495213] 

gi|449521 l|emb|AJ240653.1|SPN240653 
[4495211] 

gi|4495209|emb|AJ240652.1|SPN240652 
[4495209] 

gi|4495207|emb|AJ24065 1 .1 |SPN24065 1 
[4495207] 

gi|4495205|emb|AJ240650.1|SPN240650 
[4495205] 

gi|4495203|emb|AJ240649. 1 [SPN240649 
[4495203] 

gi|4495201|emb|AJ240648.1|SPN240648 
[4495201] 

gi|4495 1 99|emb| AJ240647. 1 |SPN240647 
[4495199] 

gi|4495 1 97|emb|AJ240644. 1 |SPN240644 
[4495197] 

gi|4495 1 95|emb|AJ240643. 1 |SPN240643 
[4495195] 

gi|4495 193|emb|AJ240642. 1 |SPN240642 
[4495193] 

gi|4495 1 9 1 |emb| AJ24064 1 . 1 |SPN24064 1 
[4495191] 



gi|4495 1 89|emb|AJ240640. 1 [SPN240640 
[4495189] 

gi|4495 1 87|emb|AJ240639 . 1 |SPN240639 
[4495187] 

gi|4495 1 85|emb|AJ240638. 1 [SPN240638 
[4495185] 

gi|4495 1 83|emb|AJ240637. 1 |SPN240637 
[4495183] 

gi|4495 1 8 1 |emb|AJ240636. 1 |SPN240636 
[4495181] 

gi|4495179|emb|AJ240635.1|SPN240635 
[4495179] 

gi|4495 1 77|emb| AJ240634. 1 |SPN240634 
[4495177] 

gi|4495 1 75|emb|AJ240633. 1 (SPN240633 
[4495175] 

gi|4495 1 73|emb|AJ240630. 1 |SPN240630 
[4495173] 

gi|4495 17 1 |emb| AJ240629. 1 |SPN240629 
[4495171] 

gi|4495 1 69|emb|AJ240628. 1 |SPN240628 
[4495169] 

gi|4495 1 67|emb|AJ240627. 1 |SPN240627 
[4495167] 

gi|4495165|emb|AJ240626.1|SPN240626 
[4495165] 

gi|4495163|emb|AJ240625.1|SPN240625
[4495163]

gi|4495161|emb|AJ240624.1|SPN240624
[4495161]

gi|4495159|emb|AJ240623.1|SPN240623
[4495159]

gi|4495157|emb|AJ240622.1|SPN240622
[4495157]

gi|4495155|emb|AJ240621.1|SPN240621
[4495155]

gi|4495153|emb|AJ240620.1|SPN240620
[4495153] 

gi|4495151|emb|AJ240619.1|SPN240619 
[4495151] 

gi|4495149|emb|AJ240616.1|SPN240616
[4495149]

gi|4495147|emb|AJ240615.1|SPN240615
[4495147]

gi|4495145|emb|AJ240614.1|SPN240614
[4495145]

gi|4495143|emb|AJ240613.1|SPN240613
[4495143]

gi|4495141|emb|AJ240612.1|SPN240612
[4495141]

gi|4495139|emb|AJ240611.1|SPN240611
[4495139]

gi|4495137|emb|AJ240610.1|SPN240610
[4495137]

gi|4495135|emb|AJ240609.1|SPN240609
[4495135]

gi|4495133|emb|AJ240608.1|SPN240608
[4495133]

gi|4495131|emb|AJ240607.1|SPN240607
[4495131]

gi|4495129|emb|AJ240606.1|SPN240606
[4495129] 

gi|4883698|gb|AF079807.1|AF079807 [4883698] 

gi|4838562|gb|AF145055.1|AF145055 [4838562] 

gi|4063727|gb|L29324.1|STRINTE [4063727] 

gi|3093401|emb|AJ005619.1|SPAJ5619 [3093401] 

gi|4103889|gb|AF029368.1|AF029368 [4103889]

gi|2897689|dbj|D63805.1|D63805 [2897689]

gi|4566771|gb|AF117741.1|AF117741 [4566771]

gi|4566768|gb|AF117740.1|AF117740 [4566768]

gi|4538836|emb|AJ240793.1|SPN240793 
[4538836] 

gi|4538832|emb|AJ240792.1|SPN240792 
[4538832] 

gi|4538828|emb|AJ240791.1|SPN240791
[4538828] 

gi|4538824|emb|AJ240790.1|SPN240790 
[4538824] 

gi|4538821|emb|AJ240789.1|SPN240789 
[4538821] 

gi|4538818|emb|AJ240788.1|SPN240788 
[4538818] 

gi|4538815|emb|AJ240787.1|SPN240787 
[4538815] 

gi|4538812|emb|AJ240786.1|SPN240786 
[4538812] 

gi|4538809|emb|AJ240785.1|SPN240785 
[4538809] 

gi|4538806|emb|AJ240784.1|SPN240784 
[4538806] 

gi|4538803|emb|AJ240783.1|SPN240783 
[4538803] 

gi|4538800|emb|AJ240782.1|SPN240782 
[4538800] 



gi|4538797|emb|AJ240781.1|SPN240781
[4538797] 

gi|4538794|emb|AJ240780.1|SPN240780 
[4538794] 

gi|4538791|emb|AJ240779.1|SPN240779
[4538791] 

gi|4538788|emb|AJ240778.1|SPN240778 
[4538788] 

gi|4538785|emb|AJ240777.1|SPN240777 
[4538785] 

gi|4538782|emb|AJ240776.1|SPN240776
[4538782] 

gi|4538779|emb|AJ240775.1|SPN240775 
[4538779] 

gi|4538776|emb|AJ240774.1|SPN240774 
[4538776] 

gi|4538773|emb|AJ240773.1|SPN240773 
[4538773] 

gi|4538770|emb|AJ240772.1|SPN240772 
[4538770] 

gi|4538767|emb|AJ240771.1|SPN240771
[4538767]

gi|4538764|emb|AJ240770.1|SPN240770
[4538764]

gi|4538761|emb|AJ240769.1|SPN240769
[4538761] 

gi|4538758|emb|AJ240768.1|SPN240768 
[4538758] 

gi|4538755|emb|AJ240767.1|SPN240767
[4538755]

gi|4538752|emb|AJ240766.1|SPN240766
[4538752]

gi|4538749|emb|AJ240765.1|SPN240765
[4538749]

gi|4538746|emb|AJ240761.1|SPN240761
[4538746]

gi|4538743|emb|AJ240760.1|SPN240760
[4538743]

gi|4538740|emb|AJ240759.1|SPN240759
[4538740]

gi|4538737|emb|AJ240758.1|SPN240758
[4538737]

gi|4538734|emb|AJ240757.1|SPN240757
[4538734]

gi|4538731|emb|AJ240756.1|SPN240756
[4538731]

gi|4538728|emb|AJ240755.1|SPN240755
[4538728]

gi|4538725|emb|AJ240754.1|SPN240754
[4538725] 

gi|4538722|emb|AJ240753.1|SPN240753 
[4538722] 

gi|4538719|emb|AJ240752.1|SPN240752 
[4538719] 

gi|4538716|emb|AJ240751.1|SPN240751
[4538716]

gi|4538713|emb|AJ240750.1|SPN240750
[4538713]

gi|4538710|emb|AJ240749.1|SPN240749
[4538710] 

gi|4538707|emb|AJ240748.1|SPN240748 
[4538707] 

gi|4538704|emb|AJ240747.1|SPN240747
[4538704]

gi|4538701|emb|AJ240746.1|SPN240746
[4538701]

gi|4538698|emb|AJ240745.1|SPN240745
[4538698]

gi|4538695|emb|AJ240744.1|SPN240744
[4538695]

gi|4538692|emb|AJ240743.1|SPN240743
[4538692]

gi|4538689|emb|AJ240742.1|SPN240742
[4538689]

gi|4538686|emb|AJ240741.1|SPN240741
[4538686]

gi|4538683|emb|AJ240740.1|SPN240740
[4538683]

gi|4538680|emb|AJ240739.1|SPN240739
[4538680]

gi|4538677|emb|AJ240738.1|SPN240738
[4538677] 

gi|4530444|gb|AF118229.1|AF118229 [4530444]
gi|4519253|dbj|AB015852.1|AB015852 [4519253]
gi|4519251|dbj|AB015851.1|AB015851 [4519251]
gi|4519249|dbj|AB015850.1|AB015850 [4519249]
gi|4519247|dbj|AB015849.1|AB015849 [4519247]
gi|4519245|dbj|AB015848.1|AB015848 [4519245]
gi|4519243|dbj|AB015847.1|AB015847 [4519243]
gi|4519241|dbj|AB015846.1|AB015846 [4519241]
gi|4519239|dbj|AB011210.1|AB011210 [4519239]
gi|4519237|dbj|AB011209.1|AB011209 [4519237]
gi|4519235|dbj|AB011208.1|AB011208 [4519235]

gi|4519233|dbj|AB011207.1|AB011207 [4519233]

gi|4519231|dbj|AB011206.1|AB011206 [4519231]

gi|4519229|dbj|AB011205.1|AB011205 [4519229]

gi|4519227|dbj|AB011204.1|AB011204 [4519227]

gi|4519225|dbj|AB011203.1|AB011203 [4519225]

gi|4519223|dbj|AB011202.1|AB011202 [4519223]

gi|4519221|dbj|AB011201.1|AB011201 [4519221]

gi|4519219|dbj|AB011200.1|AB011200 [4519219]

gi|4519217|dbj|AB011199.1|AB011199 [4519217]

gi|4519215|dbj|AB011198.1|AB011198 [4519215]

gi|4495127|emb|AJ240605.1|SPN240605
[4495127]

gi|4468031|emb|AJ132957.1|SPN132957
[4468031] 

gi|4468029|emb|AJ132956.1|SPN132956 
[4468029] 

gi|4218532|emb|AJ010312.1|SPN010312 
[4218532] 

gi|4456852|emb|AJ236792.1|SPN236792 
[4456852] 

gi|4456850|emb|AJ236791.1|SPN236791
[4456850]

gi|4456848|emb|AJ236790.1|SPN236790
[4456848]

gi|4456846|emb|AJ236789.1|SPN236789
[4456846]

gi|3550644|emb|AJ006987.1|SPAJ6987 [3550644]

gi|3550625|emb|AJ006986.1|SPAJ6986 [3550625] 

gi|4416518|gb|AF014458.2|AF014458 [4416518] 

gi|4406260|gb|AF105116.1|AF105116 [4406260]

gi|4406257|gb|AF105115.1|AF105115 [4406257]

gi|4406254|gb|AF105114.1|AF105114 [4406254]

gi|4406246|gb|AF105113.1|AF105113 [4406246]

gi|4406243|gb|AF105112.1|AF105112 [4406243]

gi|4138533|emb|AJ005815.1|SPN5815 [4138533]

gi|3821726|emb|AJ232433.1|SPN232433
[3821726] 

gi|3821724|emb|AJ232432.1|SPN232432 
[3821724] 

gi|3821722|emb|AJ232431.1|SPN232431
[3821722]

gi|3821720|emb|AJ232430.1|SPN232430
[3821720]

gi|3821718|emb|AJ232429.1|SPN232429
[3821718]

gi|3821716|emb|AJ232428.1|SPN232428
[3821716] 

gi|3821714|emb|AJ232427.1|SPN232427 
[3821714] 

gi|3821712|emb|AJ232426.1|SPN232426 
[3821712] 

gi|3821710|emb|AJ232425.1|SPN232425
[3821710] 

gi|3821708|emb|AJ232424.1|SPN232424 
[3821708] 

gi|3821706|emb|AJ232423.1|SPN232423
[3821706]

gi|3821704|emb|AJ232422.1|SPN232422
[3821704]

gi|3821702|emb|AJ232421.1|SPN232421
[3821702] 

gi|3821700|emb|AJ232420.1|SPN232420 
[3821700] 

gi|3821698|emb|AJ232419.1|SPN232419 
[3821698] 

gi|3821696|emb|AJ232418.1|SPN232418
[3821696]

gi|3821694|emb|AJ232417.1|SPN232417
[3821694] 

gi|3821692|emb|AJ232416.1|SPN232416 
[3821692] 

gi|3821690|emb|AJ232415.1|SPN232415
[3821690]

gi|3821688|emb|AJ232414.1|SPN232414
[3821688]

gi|3821686|emb|AJ232413.1|SPN232413
[3821686] 

gi|3821684|emb|AJ232412.1|SPN232412 
[3821684] 

gi|3821682|emb|AJ232411.1|SPN232411
[3821682]

gi|3821680|emb|AJ232410.1|SPN232410
[3821680]

gi|3821678|emb|AJ232409.1|SPN232409
[3821678] 

gi|3821676|emb|AJ232408.1|SPN232408 
[3821676] 

gi|3821674|emb|AJ232407.1|SPN232407 
[3821674] 

gi|3821672|emb|AJ232406.1|SPN232406
[3821672] 



gi|3821670|emb|AJ232405.1|SPN232405 
[3821670] 

gi|3821668|emb|AJ232404.1|SPN232404 
[3821668] 

gi|3821666|emb|AJ232403.1|SPN232403
[3821666]

gi|3821664|emb|AJ232402.1|SPN232402
[3821664]

gi|3821662|emb|AJ232401.1|SPN232401
[3821662]

gi|3821660|emb|AJ232399.1|SPN232399
[3821660]

gi|3821658|emb|AJ232398.1|SPN232398
[3821658]

gi|3821656|emb|AJ232397.1|SPN232397
[3821656]

gi|3821654|emb|AJ232396.1|SPN232396
[3821654] 

gi|3821652|emb|AJ232395.1|SPN232395 
[3821652] 

gi|3821650|emb|AJ232394.1|SPN232394 
[3821650] 

gi|3821648|emb|AJ232393.1|SPN232393
[3821648] 

gi|3821646|emb|AJ232392.1|SPN232392 
[3821646] 

gi|3821644|emb|AJ232391.1|SPN232391
[3821644]

gi|3821642|emb|AJ232390.1|SPN232390
[3821642]

gi|3821640|emb|AJ232389.1|SPN232389
[3821640] 

gi|3821638|emb|AJ232388.1|SPN232388 
[3821638] 

gi|3821636|emb|AJ232387.1|SPN232387 
[3821636] 

gi|3821634|emb|AJ232386.1|SPN232386
[3821634] 

gi|3821632|emb|AJ232385.1|SPN232385 
[3821632] 

gi|3821630|emb|AJ232384.1|SPN232384 
[3821630] 

gi|3821628|emb|AJ232383.1|SPN232383 
[3821628]

gi|3821626|emb|AJ232382.1|SPN232382 
[3821626] 

gi|3821624|emb|AJ232381.1|SPN232381
[3821624]

gi|3821622|emb|AJ232380.1|SPN232380
[3821622]

gi|3821620|emb|AJ232379.1|SPN232379
[3821620]

gi|3821618|emb|AJ232378.1|SPN232378
[3821618]

gi|3821616|emb|AJ232377.1|SPN232377
[3821616]

gi|3821614|emb|AJ232376.1|SPN232376
[3821614] 

gi|3821612|emb|AJ232375.1|SPN232375 
[3821612] 

gi|3821610|emb|AJ232373.1|SPN232373 
[3821610] 

gi|3821608|emb|AJ232372.1|SPN232372 
[3821608] 

gi|3821606|emb|AJ232371.1|SPN232371
[3821606] 

gi|3821604|emb|AJ232370.1|SPN232370 
[3821604] 

gi|3821602|emb|AJ232369.1|SPN232369
[3821602] 

gi|3821600|emb|AJ232368.1|SPN232368 
[3821600] 

gi|3821598|emb|AJ232367.1|SPN232367 
[3821598] 

gi|3821596|emb|AJ232366.1|SPN232366
[3821596] 

gi|3821594|emb|AJ232365.1|SPN232365 
[3821594] 

gi|3820454|emb|AJ007367.1|SPN7367 [3820454]

gi|3821592|emb|AJ232364.1|SPN232364
[3821592]

gi|3821590|emb|AJ232363.1|SPN232363
[3821590] 

gi|3821588|emb|AJ232362.1|SPN232362 
[3821588] 

gi|3821586|emb|AJ232361.1|SPN232361
[3821586] 

gi|3821584|emb|AJ232360.1|SPN232360 
[3821584] 

gi|3821582|emb|AJ232359.1|SPN232359 
[3821582] 

gi|3821580|emb|AJ232358.1|SPN232358 
[3821580] 

gi|3821578|emb|AJ232357.1|SPN232357 
[3821578] 



gi|3821576|emb|AJ232356.1|SPN232356
[3821576]

gi|3821574|emb|AJ232355.1|SPN232355
[3821574]

gi|3821572|emb|AJ232353.1|SPN232353
[3821572]

gi|3821570|emb|AJ232352.1|SPN232352
[3821570]

gi|3821568|emb|AJ232351.1|SPN232351
[3821568]

gi|3821566|emb|AJ232350.1|SPN232350
[3821566] 

gi|3821564|emb|AJ232349.1|SPN232349 
[3821564] 

gi|3821562|emb|AJ232348.1|SPN232348 
[3821562] 

gi|3821560|emb|AJ232347.1|SPN232347 
[3821560] 

gi|3821558|emb|AJ232346.1|SPN232346 
[3821558] 

gi|3821556|emb|AJ232345.1|SPN232345 
[3821556] 

gi|3821554|emb|AJ232344.1|SPN232344
[3821554] 

gi|3821552|emb|AJ232343.1|SPN232343 
[3821552] 

gi|3821550|emb|AJ232342.1|SPN232342 
[3821550] 

gi|3821548|emb|AJ232341.1|SPN232341
[3821548]

gi|3821546|emb|AJ232340.1|SPN232340
[3821546]

gi|3821544|emb|AJ232339.1|SPN232339
[3821544] 

gi|3821542|emb|AJ232338.1|SPN232338 
[3821542] 

gi|3821540|emb|AJ232337.1|SPN232337
[3821540]

gi|3821538|emb|AJ232336.1|SPN232336
[3821538]

gi|3821536|emb|AJ232335.1|SPN232335
[3821536]

gi|3821534|emb|AJ232334.1|SPN232334
[3821534]

gi|3821532|emb|AJ232333.1|SPN232333
[3821532]

gi|3821530|emb|AJ232332.1|SPN232332
[3821530]

gi|3821528|emb|AJ232331.1|SPN232331
[3821528]

gi|3821526|emb|AJ232330.1|SPN232330
[3821526]

gi|3821524|emb|AJ232329.1|SPN232329
[3821524] 

gi|3821522|emb|AJ232328.1|SPN232328 
[3821522] 

gi|3821520|emb|AJ232327.1|SPN232327
[3821520]

gi|3821518|emb|AJ232326.1|SPN232326
[3821518]

gi|3821516|emb|AJ232325.1|SPN232325
[3821516] 

gi|3821514|emb|AJ232324.1|SPN232324 
[3821514] 

gi|3821512|emb|AJ232322.1|SPN232322 
[3821512] 

gi|3821510|emb|AJ232321.1|SPN232321 
[3821510] 

gi|3821508|emb|AJ232320.1|SPN232320 
[3821508] 

gi|3821506|emb|AJ232319.1|SPN232319 
[3821506] 

gi|3821504|emb|AJ232318.1|SPN232318
[3821504] 

gi|3821502|emb|AJ232317.1|SPN232317 
[3821502] 

gi|3821500|emb|AJ232316.1|SPN232316 
[3821500] 

gi|3821498|emb|AJ232315.1|SPN232315
[3821498] 

gi|3821496|emb|AJ232314.1|SPN232314 
[3821496] 

gi|3821494|emb|AJ232313.1|SPN232313
[3821494]

gi|3821492|emb|AJ232312.1|SPN232312
[3821492]

gi|3821490|emb|AJ232311.1|SPN232311
[3821490] 

gi|3821488|emb|AJ232310.1|SPN232310 
[3821488] 

gi|3821486|emb|AJ232309.1|SPN232309 
[3821486] 

gi|3821484|emb|AJ232308.1|SPN232308 
[3821484] 

gi|3821482|emb|AJ232307.1|SPN232307
[3821482]

gi|3821480|emb|AJ232306.1|SPN232306
[3821480] 

gi|3821478|emb|AJ232305.1|SPN232305 
[3821478] 

gi|3821476|emb|AJ232304.1|SPN232304 
[3821476] 

gi|3821474|emb|AJ232303.1|SPN232303
[3821474] 

gi|3821472|emb|AJ232302.1|SPN232302 
[3821472] 

gi|3821470|emb|AJ232301.1|SPN232301
[3821470] 

gi|3821468|emb|AJ232300.1|SPN232300 
[3821468] 

gi|3821466|emb|AJ232299.1|SPN232299
[3821466]

gi|3821464|emb|AJ232298.1|SPN232298
[3821464] 

gi|3821462|emb|AJ232297.1|SPN232297 
[3821462] 

gi|3821460|emb|AJ232295.1|SPN232295
[3821460] 

gi|3821458|emb|AJ232294.1|SPN232294 
[3821458] 

gi|3821456|emb|AJ232293.1|SPN232293 
[3821456] 

gi|3821454|emb|AJ232292.1|SPN232292 
[3821454] 

gi|3821452|emb|AJ232291.1|SPN232291
[3821452]

gi|3821450|emb|AJ232290.1|SPN232290
[3821450] 

gi|3821448|emb|AJ232289.1|SPN232289 
[3821448] 

gi|3821446|emb|AJ232288.1|SPN232288 
[3821446] 

gi|3821444|emb|AJ232287.1|SPN232287 
[3821444] 

gi|3821442|emb|AJ232286.1|SPN232286 
[3821442] 

gi|3821440|emb|AJ232285.1|SPN232285 
[3821440] 

gi|3821438|emb|AJ232284.1|SPN232284 
[3821438]

gi|3821436|emb|AJ232283.1|SPN232283 
[3821436] 

gi|3821434|emb|AJ232282.1|SPN232282
[3821434] 



gi|3821432|emb|AJ232281.1|SPN232281
[3821432] 

gi|3821430|emb|AJ232280.1|SPN232280 
[3821430] 

gi|3821428|emb|AJ232279.1|SPN232279 
[3821428] 

gi|3821426|emb|AJ232278.1|SPN232278 
[3821426] 

gi|3821424|emb|AJ232276.1|SPN232276 
[3821424] 

gi|3821422|emb|AJ232275.1|SPN232275 
[3821422] 

gi|3821420|emb|AJ232274.1|SPN232274 
[3821420] 

gi|3821418|emb|AJ232273.1|SPN232273
[3821418]

gi|3821416|emb|AJ232272.1|SPN232272
[3821416] 

gi|3821414|emb|AJ232271.1|SPN232271 
[3821414] 

gi|3821412|emb|AJ232270.1|SPN232270 
[3821412] 

gi|3821410|emb|AJ232269.1|SPN232269 
[3821410] 

gi|3821408|emb|AJ232268.1|SPN232268 
[3821408] 

gi|3821406|emb|AJ232267.1|SPN232267 
[3821406] 

gi|3821404|emb|AJ232266.1|SPN232266
[3821404] 

gi|3821402|emb|AJ232265.1|SPN232265 
[3821402] 

gi|3821400|emb|AJ232264.1|SPN232264 
[3821400] 

gi|3821398|emb|AJ232263.1|SPN232263 
[3821398] 

gi|3821396|emb|AJ232262.1|SPN232262 
[3821396] 

gi|3821394|emb|AJ232261.1|SPN232261 
[3821394] 

gi|3821392|emb|AJ232260.1|SPN232260 
[3821392] 

gi|3821390|emb|AJ232259.1|SPN232259 
[3821390] 

gi|3821388|emb|AJ232258.1|SPN232258
[3821388]

gi|3821386|emb|AJ232257.1|SPN232257
[3821386]

gi|3821384|emb|AJ232256.1|SPN232256
[3821384] 

gi|3821382|emb|AJ232255.1|SPN232255 
[3821382] 

gi|3821380|emb|AJ232254.1|SPN232254
[3821380]

gi|3821378|emb|AJ232253.1|SPN232253
[3821378]

gi|3821376|emb|AJ232252.1|SPN232252
[3821376]

gi|3821374|emb|AJ232251.1|SPN232251
[3821374] 

gi|3821372|emb|AJ232250.1|SPN232250 
[3821372] 

gi|3821370|emb|AJ232249.1|SPN232249
[3821370] 

gi|3821367|emb|AJ232248.1|SPN232248 
[3821367] 

gi|3821365|emb|AJ232247.1|SPN232247 
[3821365] 

gi|3821363|emb|AJ232246.1|SPN232246
[3821363] 

gi|3821361|emb|AJ232245.1|SPN232245 
[3821361] 

gi|3821359|emb|AJ232244.1|SPN232244 
[3821359] 

gi|3821357|emb|AJ232243.1|SPN232243 
[3821357] 

gi|3821355|emb|AJ232241.1|SPN232241
[3821355]

gi|2921842|gb|AF047385.1|AF047385 [2921842]

gi|2909863|gb|AF047696.1|AF047696 [2909863]

gi|4193353|gb|AF055088.1|AF055088 [4193353]

gi|4185242|gb|AH007276.1|SEG_SPTNJUNC
[4185242]

gi|4185241|gb|AF066797.1|SPTNJUNC2
[4185241]

gi|4185240|gb|AF066796.1|SPTNJUNC1
[4185240]

gi|4097979|gb|U72655.1|SPU72655 [4097979]
gi|4063720|gb|L29323.1|STRMTR [4063720]
gi|1657605|gb|U66846.1|SPU66846 [1657605]
gi|1657602|gb|U66845.1|SPU66845 [1657602]
gi|4009485|gb|AF068903.1|AF068903 [4009485]
gi|4009477|gb|AF068902.1|AF068902 [4009477]
gi|4009462|gb|AF068901.1|AF068901 [4009462]

gi|3947767|emb|AJ233896.1|SPN233896
[3947767] 

gi|3947765|emb|AJ233895.1|SPN233895 
[3947765] 

gi|3947763|emb|AJ233894.1|SPN233894 
[3947763] 

gi|3947761|emb|AJ233893.1|SPN233893 
[3947761] 

gi|3947759|emb|AJ233892.1|SPN233892 
[3947759] 

gi|3947757|emb|AJ233891.1|SPN233891
[3947757] 

gi|3947755|emb|AJ233890.1|SPN233890 
[3947755] 

gi|3947753|emb|AJ233889.1|SPN233889 
[3947753] 

gi|3947751|emb|AJ233888.1|SPN233888
[3947751] 

gi|3947749|emb|AJ233887.1|SPN233887 
[3947749] 

gi|3947730|emb|AJ233886.1|SPN233886 
[3947730] 

gi|3758891|emb|Z71552.1|SPADCA [3758891] 

gi|3818479|gb|AF057294.1|AF057294 [3818479]

gi|2351767|gb|U89711.1|SPU89711 [2351767]

gi|3395661|dbj|AB006879.1|AB006879 [3395661]

gi|3395659|dbj|AB006878.1|AB006878 [3395659]

gi|3395657|dbj|AB006877.1|AB006877 [3395657]

gi|3395655|dbj|AB006876.1|AB006876 [3395655]

gi|3395653|dbj|AB006875.1|AB006875 [3395653]

gi|3395651|dbj|AB006874.1|AB006874 [3395651] 

gi|3395649|dbj|AB006873.1|AB006873 [3395649] 

gi|3395647|dbj|AB006872.1|AB006872 [3395647] 

gi|3395645|dbj|AB006871.1|AB006871 [3395645]

gi|3395643|dbj|AB006870.1|AB006870 [3395643]

gi|3395641|dbj|AB006869.1|AB006869 [3395641]

gi|3395639|dbj|AB006868.1|AB006868 [3395639]

gi|2315992|gb|U87092.1|SPU87092 [2315992]

gi|2209338|gb|U93576.1|SPU93576 [2209338]

gi|2109442|gb|AF000658.1|SPDNAARG
[2109442] 

gi|1881538|gb|U09239.1|SPU09239 [1881538]
gi|1666904|gb|U76218.1|SPU76218 [1666904]
gi|1613766|gb|U33315.1|SPU33315 [1613766]

gi|1498294|gb|U41735.1|SPU41735 [1498294]
gi|1213493|gb|U47687.1|SPU47687 [1213493]
gi|1163109|gb|U43526.1|SPU43526 [1163109]
gi|556001|gb|U15171.1|SPU15171 [556001]
gi|455063|gb|U02920.1|SPU02920 [455063]
gi|784896|gb|L36923.1|STRSTRH [784896]
gi|3320386|gb|AF030373.1|AF030373 [3320386]
gi|2804772|gb|AF030374.1|AF030374 [2804772]
gi|2804762|gb|AF030372.1|AF030372 [2804762]
gi|2804756|gb|AF030371.1|AF030371 [2804756]
gi|2804750|gb|AF030370.1|AF030370 [2804750]
gi|2804745|gb|AF030369.1|AF030369 [2804745]
gi|2804739|gb|AF030368.1|AF030368 [2804739]
gi|2804732|gb|AF030367.1|AF030367 [2804732]
gi|2804726|gb|AF030366.1|AF030366 [2804726]
gi|2804720|gb|AF030365.1|AF030365 [2804720]
gi|2804713|gb|AF030364.1|AF030364 [2804713]
gi|2804707|gb|AF030363.1|AF030363 [2804707]
gi|2804701|gb|AF030362.1|AF030362 [2804701]
gi|2804694|gb|AF030361.1|AF030361 [2804694]
gi|2804688|gb|AF030360.1|AF030360 [2804688]
gi|2804682|gb|AF030359.1|AF030359 [2804682]
gi|3550979|dbj|AB010387.1|AB010387 [3550979]
gi|2275100|emb|AJ000336.1|SPR6LDH [2275100]
gi|3551853|gb|AF076029.1|AF076029 [3551853]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3550617|emb|AJ004869.1|SPAJ4869 [3550617]
gi|3513563|gb|AF055727.1|AF055727 [3513563]
gi|3513561|gb|AF055726.1|AF055726 [3513561]
gi|3513559|gb|AF055725.1|AF055725 [3513559]
gi|3513557|gb|AF055724.1|AF055724 [3513557]
gi|3513555|gb|AF055723.1|AF055723 [3513555]
gi|3513553|gb|AF055722.1|AF055722 [3513553]
gi|3513549|gb|AF055721.1|AF055721 [3513549]
gi|3513545|gb|AF055720.1|AF055720 [3513545]
gi|1914869|emb|Z82001.1|SPZ82001 [1914869]
gi|2911421|gb|AF046238.1|AF046238 [2911421]
gi|2911419|gb|AF046237.1|AF046237 [2911419]
gi|2911417|gb|AF046236.1|AF046236 [2911417]
gi|2911415|gb|AF046235.1|AF046235 [2911415]
gi|2911413|gb|AF046234.1|AF046234 [2911413]
gi|2911411|gb|AF046233.1|AF046233 [2911411]
gi|2911409|gb|AF046232.1|AF046232 [2911409]
gi|2911407|gb|AF046231.1|AF046231 [2911407]
gi|2911405|gb|AF046230.1|AF046230 [2911405]
gi|3258601|gb|U40786.1|SPU40786 [3258601]
gi|3211756|gb|AF052209.1|AF052209 [3211756]
gi|3211752|gb|AF052208.1|AF052208 [3211752]
gi|3211747|gb|AF052207.1|AF052207 [3211747]
gi|3220194|gb|AF053121.1|AF053121 [3220194]
gi|2766052|emb|Z99863.1|SPZ99863 [2766052]
gi|2766050|emb|Z99862.1|SPZ99862 [2766050]
gi|2766048|emb|Z99861.1|SPZ99861 [2766048]
gi|2766046|emb|Z99860.1|SPZ99860 [2766046]
gi|2766044|emb|Z99859.1|SPZ99859 [2766044]
gi|2766042|emb|Z99858.1|SPZ99858 [2766042]
gi|2766040|emb|Z99857.1|SPZ99857 [2766040]
gi|2766038|emb|Z99856.1|SPZ99856 [2766038]
gi|2766036|emb|Z99855.1|SPZ99855 [2766036]
gi|2766034|emb|Z99854.1|SPZ99854 [2766034]
gi|2766032|emb|Z99853.1|SPZ99853 [2766032]
gi|2766030|emb|Z99852.1|SPZ99852 [2766030]
gi|2766028|emb|Z99851.1|SPZ99851 [2766028]
gi|2766026|emb|Z99850.1|SPZ99850 [2766026]
gi|2766024|emb|Z99849.1|SPZ99849 [2766024]
gi|2766022|emb|Z99848.1|SPZ99848 [2766022]
gi|2766020|emb|Z99847.1|SPZ99847 [2766020]
gi|2766018|emb|Z99846.1|SPZ99846 [2766018]
gi|2766016|emb|Z99845.1|SPZ99845 [2766016]
gi|2766014|emb|Z99844.1|SPZ99844 [2766014]
gi|2766012|emb|Z99843.1|SPZ99843 [2766012]
gi|2766010|emb|Z99842.1|SPZ99842 [2766010]
gi|2766008|emb|Z99841.1|SPZ99841 [2766008]
gi|2766006|emb|Z99840.1|SPZ99840 [2766006]
gi|2766004|emb|Z99839.1|SPZ99839 [2766004]
gi|2766002|emb|Z99838.1|SPZ99838 [2766002]
gi|2766000|emb|Z99837.1|SPZ99837 [2766000]
gi|2765998|emb|Z99828.1|SPZ99828 [2765998]
gi|2765996|emb|Z99827.1|SPZ99827 [2765996]
gi|2765994|emb|Z99826.1|SPZ99826 [2765994]



gi|2765992|emb|Z99825.1|SPZ99825 [2765992]

gi|2765990|emb|Z99824.1|SPZ99824 [2765990]

gi|2765988|emb|Z99823.1|SPZ99823 [2765988]

gi|2765986|emb|Z99822.1|SPZ99822 [2765986]

gi|2765984|emb|Z99821.1|SPZ99821 [2765984]

gi|2765982|emb|Z99820.1|SPZ99820 [2765982]

gi|2765980|emb|Z99819.1|SPZ99819 [2765980]

gi|2765978|emb|Z99818.1|SPZ99818 [2765978]

gi|2765976|emb|Z99817.1|SPZ99817 [2765976]

gi|2765974|emb|Z99816.1|SPZ99816 [2765974]

gi|2765972|emb|Z99815.1|SPZ99815 [2765972]

gi|2765970|emb|Z99814.1|SPZ99814 [2765970] 

gi|2765968|emb|Z99813.1|SPZ99813 [2765968] 

gi|2765966|emb|Z99812.1|SPZ99812 [2765966]

gi|2765964|emb|Z99811.1|SPZ99811 [2765964]

gi|2765962|emb|Z99810.1|SPZ99810 [2765962] 

gi|2765960|emb|Z99809.1|SPZ99809 [2765960] 

gi|2765958|emb|Z99808.1|SPZ99808 [2765958] 

gi|2765956|emb|Z99807.1|SPZ99807 [2765956]

gi|2765954|emb|Z99806.1|SPZ99806 [2765954]

gi|2765952|emb|Z99805.1|SPZ99805 [2765952]

gi|2765950|emb|Z99804.1|SPZ99804 [2765950]

gi|2765948|emb|Z99803.1|SPZ99803 [2765948]

gi|2894104|emb|X77249.1|SPR6CIARH [2894104]

gi|3153897|gb|AF067128.1|AF067128 [3153897]

gi|3152712|gb|AF065153.1|AF065153 [3152712]

gi|3152710|gb|AF065152.1|AF065152 [3152710]

gi|3152708|gb|AF065151.1|AF065151 [3152708]

gi|3116426|gb|U84387.1|SPU84387 [3116426]

gi|2385403|emb|AJ001247.1|SP7465RR3
[2385403]

gi|2342540|emb|AJ001250.1|SP7978RR5 
[2342540] 

gi|2342539|emb|AJ001251.1|SP7978RR3
[2342539]

gi|2342538|emb|AJ001248.1|SP7466RR5
[2342538]

gi|2342537|emb|AJ001249.1|SP7466RR3
[2342537] 

gi|3065896|gb|AF058920.1|AF058920 [3065896]
gi|2982647|emb|AJ002294.1|SPAJ2294 [2982647]
gi|2982645|emb|AJ002293.1|SPAJ2293 [2982645]
gi|2982643|emb|AJ002292.1|SPAJ2292 [2982643]
gi|2982641|emb|AJ002291.1|SPAJ2291 [2982641]
gi|1620466|emb|X99400.1|SPDACAO [1620466]
gi|2196665|emb|Z84381.1|HSZ84381 [2196665]
gi|2196663|emb|Z84380.1|HSZ84380 [2196663]
gi|2196661|emb|Z84379.1|HSZ84379 [2196661]
gi|2196659|emb|Z84378.1|HSZ84378 [2196659]
gi|625175|gb|L36131.1|STREXP10A [625175]
gi|3004945|gb|AF036624.1|AF036624 [3004945]
gi|3004943|gb|AF036623.1|AF036623 [3004943]
gi|3004941|gb|AF036622.1|AF036622 [3004941]
gi|3004939|gb|AF036621.1|AF036621 [3004939]
gi|3004937|gb|AF036620.1|AF036620 [3004937]
gi|3004935|gb|AF036619.1|AF036619 [3004935]
gi|2370572|emb|Z86112.1|SPZ86112 [2370572]
gi|2765946|emb|Z99802.1|SPZ99802 [2765946]
gi|2398824|emb|Z34303.1|SPCINREC [2398824]
gi|2894512|emb|AJ223491.1|SPPPR3 [2894512]
gi|2198539|emb|X85787.1|SPCPS14E [2198539]
gi|2766156|emb|Z99915.1|SPZ99915 [2766156]
gi|2766154|emb|Z99914.1|SPZ99914 [2766154]
gi|2766152|emb|Z99913.1|SPZ99913 [2766152]
gi|2766150|emb|Z99912.1|SPZ99912 [2766150]
gi|2766148|emb|Z99911.1|SPZ99911 [2766148]
gi|2766146|emb|Z99910.1|SPZ99910 [2766146]
gi|2766144|emb|Z99909.1|SPZ99909 [2766144]
gi|2766142|emb|Z99908.1|SPZ99908 [2766142]
gi|2766140|emb|Z99907.1|SPZ99907 [2766140]
gi|2766138|emb|Z99906.1|SPZ99906 [2766138]
gi|2766136|emb|Z99905.1|SPZ99905 [2766136]
gi|2766134|emb|Z99904.1|SPZ99904 [2766134]
gi|2766132|emb|Z99903.1|SPZ99903 [2766132]
gi|2766130|emb|Z99902.1|SPZ99902 [2766130]
gi|2766128|emb|Z99901.1|SPZ99901 [2766128]
gi|2766126|emb|Z99900.1|SPZ99900 [2766126]
gi|2766124|emb|Z99899.1|SPZ99899 [2766124]
gi|2766122|emb|Z99898.1|SPZ99898 [2766122]
gi|2766120|emb|Z99897.1|SPZ99897 [2766120]
gi|2766118|emb|Z99896.1|SPZ99896 [2766118]



gi|2766116|emb|Z99895.1|SPZ99895 [2766116]
gi|2766114|emb|Z99894.1|SPZ99894 [2766114]
gi|2766112|emb|Z99893.1|SPZ99893 [2766112]
gi|2766110|emb|Z99892.1|SPZ99892 [2766110]
gi|2766108|emb|Z99891.1|SPZ99891 [2766108]
gi|2766106|emb|Z99890.1|SPZ99890 [2766106]
gi|2766104|emb|Z99889.1|SPZ99889 [2766104]
gi|2766102|emb|Z99888.1|SPZ99888 [2766102]
gi|2766100|emb|Z99887.1|SPZ99887 [2766100]
gi|2766098|emb|Z99886.1|SPZ99886 [2766098]
gi|2766096|emb|Z99885.1|SPZ99885 [2766096]
gi|2766094|emb|Z99884.1|SPZ99884 [2766094]
gi|2766092|emb|Z99883.1|SPZ99883 [2766092]
gi|2766090|emb|Z99882.1|SPZ99882 [2766090]
gi|2766088|emb|Z99881.1|SPZ99881 [2766088]
gi|2766086|emb|Z99880.1|SPZ99880 [2766086]
gi|2766084|emb|Z99879.1|SPZ99879 [2766084]
gi|2766082|emb|Z99878.1|SPZ99878 [2766082]
gi|2766080|emb|Z99877.1|SPZ99877 [2766080]
gi|2766078|emb|Z99876.1|SPZ99876 [2766078]
gi|2766076|emb|Z99875.1|SPZ99875 [2766076]
gi|2766074|emb|Z99874.1|SPZ99874 [2766074]
gi|2766072|emb|Z99873.1|SPZ99873 [2766072]
gi|2766070|emb|Z99872.1|SPZ99872 [2766070]
gi|2766068|emb|Z99871.1|SPZ99871 [2766068]
gi|2766066|emb|Z99870.1|SPZ99870 [2766066]
gi|2766064|emb|Z99869.1|SPZ99869 [2766064]
gi|2766062|emb|Z99868.1|SPZ99868 [2766062]
gi|2766060|emb|Z99867.1|SPZ99867 [2766060]
gi|2766058|emb|Z99866.1|SPZ99866 [2766058]
gi|2766056|emb|Z99865.1|SPZ99865 [2766056]
gi|2766054|emb|Z99864.1|SPZ99864 [2766054]
gi|2765906|emb|Z99206.1|SPZ99206 [2765906]
gi|2765904|emb|Z99205.1|SPZ99205 [2765904]
gi|2765902|emb|Z99204.1|SPZ99204 [2765902]
gi|2765900|emb|Z99203.1|SPZ99203 [2765900]
gi|2765898|emb|Z99202.1|SPZ99202 [2765898]
gi|2765896|emb|Z99201.1|SPZ99201 [2765896]
gi|2765894|emb|Z99200.1|SPZ99200 [2765894]
gi|2708631|gb|AF036951.1|AF036951 [2708631]
gi|886956|emb|Z49097.1|SPCSU12X [886956]

gi|2656093|gb|L21856.1|STRMALR [2656093]

gi|2576332|emb|AJ002055.1|SPSPSA47 [2576332]

gi|2576330|emb|AJ002054.1|SPSPSA2 [2576330]

gi|2511704|emb|Y10818.1|SPY10818 [2511704]

gi|1944619|emb|Z83335.1|SPZ83335 [1944619]

gi|2425108|gb|AF019904.1|AF019904 [2425108]

gi|2385404|emb|AJ001246.1|SP7465RR5
[2385404]

gi|438213|emb|Z16082.1|PNALIB [438213] 

gi|2149613|gb|U90721.1|SPU90721 [2149613] 

gi|49391|emb|Z21841.1|SPPBP2BB [49391]

gi|2209207|gb|AF004325.1|AF004325 [2209207]

gi|2293061|emb|Z95914.1|SPZ95914 [2293061]

gi|2276393|gb|U16156.1|SPU16156 [2276393]

gi|2183314|gb|AF003930.1|AF003930 [2183314]

gi|2182093|emb|X95717.1|SPPARECGN 
[2182093] 

gi|984230|emb|Z49095.1|SPCS1111A [984230]

gi|886954|emb|Z49096.1|SPCS1092X [886954]

gi|1181613|dbj|D82873.1|STRPBP2BE [1181613]

gi|1181612|dbj|D82871.1|STRPBP2BCZ
[1181612]

gi|1181611|dbj|D82870.1|STRPBP2BB2 [1181611]

gi|1181579|dbj|D82869.1|STRPBP2BA1 [1181579]

gi|1181192|dbj|D82872.1|STRPBP2BD [1181192]

gi|575595|dbj|D42075.1|STRPBP2B2 [575595]

gi|1339971|dbj|D42074.1|STRPBP2B1 [1339971]

gi|2108329|emb|Y11463.1|SPDNAGCPO
[2108329]

gi|1944115|dbj|AB002522.1|AB002522 [1944115]
gi|1666669|emb|Z77727.1|SPIS1381C [1666669]
gi|1666668|emb|Z77726.1|SPIS1381B [1666668]
gi|1666667|emb|Z77725.1|SPIS1381A [1666667]
gi|1914873|emb|Z82002.1|SPZ82002 [1914873]
gi|1431584|emb|Z74778.1|SPDHFR [1431584]
gi|47452|emb|Z15120.1|SPSTRG [47452]
gi|581717|emb|Z12159.1|SPCP131G [581717]
gi|47342|emb|X17337.1|SPAMILOC [47342]
gi|1800300|gb|U83667.1|SPU83667 [1800300]
gi|1532066|emb|Y07780.1|SPTETOGEN [1532066]



gi|1161269|gb|L39074.1|STRSPXB [1161269]

gi|1460093|emb|X94909.1|SPIGA1PRT [1460093]

gi|1750263|gb|U72720.1|SPU72720 [1750263]

gi|298649|gb|S56948.1|S56948 [298649]

gi|254537|gb|S43511.1|S43511 [254537]

gi|245227|gb|S81051.1|S81051 [245227]

gi|245226|gb|S81045.1|S81045 [245226]

gi|245225|gb|S81043.1|S81043 [245225]

gi|1150618|emb|Z49988.1|SPMMSAGEN
[1150618]

gi|47456|emb|X01138.1|SPTN917A [47456]

gi|1658316|emb|Z47210.1|SPDEXCAP [1658316] 

gi|1550802|emb|X95385.1|SPCOMCGEN
[1550802]

gi|47457|emb|X01137.1|SPTN917B [47457]

gi|975714|emb|X90941.1|SPTRJ5251 [975714]

gi|975713|emb|X90940.1|SPTU5251 [975713]

gi|975709|emb|X90939.1|SPDNATETM [975709]

gi|1524346|emb|Z79691.1|SOORFS [1524346] 

gi|1553054|emb|X98364.1|SPPBPHU9 [1553054] 

gi|1553052|emb|X98367.1|SPPBPHU13 [1553052]

gi|1553050|emb|X98366.1|SPPBPHU12 [1553050]

gi|1553048|emb|X98365.1|SPPBPHU11 [1553048]

gi|1575029|gb|U53509.1|SPU53509 [1575029]

gi|1542968|gb|U49088.1|SPU49088 [1542968]

gi|1542966|gb|U49087.1|SPU49087 [1542966]

gi|1536961|emb|Y07845.1|SPGYRA [1536961] 

gi|47391|emb|X16367.1|SPPBPX [47391] 

gi|1490398|emb|Z67739.1|SPPARCETP [1490398]

gi|1490395|emb|Z67740.1|SPGYRBORF
[1490395]

gi|1431589|emb|Z74777.1|SPTMRDHFR
[1431589]

gi|408145|emb|Z21702.1|SPUNGMUTX [408145]
gi|47461|emb|X61025.1|SPXISINT [47461]
gi|47459|emb|X55651.1|SPUNGG [47459]
gi|47454|emb|X52632.1|SPT1545E [47454]
gi|47421|emb|Z17307.1|SPRECA [47421]
gi|47419|emb|X67873.1|SPPONA8 [47419]
gi|47417|emb|X67872.1|SPPONA7 [47417]
gi|474 1 5|emb|X6787 1 . 1 [SPPONA6 [474 1 5] 
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gi|474 1 3|emb|X67870. 1 |SPPONA5 [474 1 3] 


gi|47411|emb|X67869.1|SPPONA4 [47411]
gi|47409|emb|X67867.1|SPPONA2 [47409]
gi|47407|emb|X67866.1|SPPONA1 [47407]
gi|47405|emb|X67868.1|SPPNA3 [47405]
gi|47403|emb|X52474.1|SPPLY [47403]
gi|984232|emb|X16022.1|SPPENA [984232]
gi|517190|emb|X78215.1|SPPBPXG [517190]
gi|295840|emb|Z22230.1|SPPBP2BBA [295840]
gi|288981|emb|Z22185.1|SPPBP2BAC [288981]
gi|288979|emb|Z22184.1|SPPBP2BAB [288979]
gi|288466|emb|Z21981.1|SPPBP2BAA [288466]
gi|49390|emb|Z21813.1|SPPBP2XD [49390]
gi|49389|emb|Z21812.1|SPPBP2XC [49389]
gi|49387|emb|Z21811.1|SPPBP2BJ [49387]
gi|49385|emb|Z21810.1|SPPBP2BI [49385]
gi|49382|emb|Z21808.1|SPPBP2BH [49382]
gi|49380|emb|Z21807.1|SPPBP2BG [49380]
gi|49379|emb|Z21806.1|SPPBP2BF [49379]
gi|49377|emb|Z21805.1|SPPBP2BE [49377]
gi|49376|emb|Z21804.1|SPPBP2XB [49376]
gi|49375|emb|Z21803.1|SPPBP2XA [49375]
gi|49374|emb|Z21802.1|SPPBP2BD [49374]
gi|49372|emb|Z21801.1|SPPBP2BC [49372]
gi|49369|emb|Z21799.1|SPPBP2BA [49369]
gi|47399|emb|X13137.1|SPPENASE [47399]
gi|47397|emb|X13136.1|SPPENARE [47397]
gi|1052802|emb|X83917.1|SPGYRBG [1052802]
gi|587550|emb|X72967.1|SPNANA [587550]
gi|49384|emb|Z21809.1|SPPBP1AB [49384]
gi|49371|emb|Z21800.1|SPPBP1AA [49371]
gi|984228|emb|Z49094.1|SPCS1091A [984228]
gi|47372|emb|X54225.1|SPENDA [47372]
gi|806590|emb|Z49246.1|SP667SOD [806590]
gi|407172|emb|Z26851.1|SPATPAS2 [407172]
gi|407166|emb|Z26850.1|SPATPAS1 [407166]
gi|47353|emb|X63602.1|SPBOX [47353]
gi|47348|emb|X05577.1|SPAPHA3 [47348]
gi|47337|emb|X65132.1|SP824PBPX [47337]
gi|47335|emb|X65134.1|SP669PBPX [47335]
gi|47331|emb|X65133.1|SP577PBPX [47331]
gi|559527|emb|X65136.1|SP110PBPX [559527]
gi|311415|emb|Z22807.1|SP16SRNAA [311415]
gi|47329|emb|X65135.1|SP531PBPX [47329]
gi|47307|emb|X65131.1|SP290PBPX [47307]
gi|47295|emb|X58312.1|SP16SRNA [47295]
gi|854614|emb|Z49109.1|SPGADAGN [854614]
gi|556428|gb|L36660.1|STRORF1 [556428]
gi|511062|emb|Z35135.1|SPALIAG [511062]
gi|1208737|gb|U47625.1|SPU47625 [1208737]
gi|530062|gb|U12567.1|SPU12567 [530062]
gi|153656|gb|M29686.1|STRHEXB [153656]
gi|153654|gb|M18729.1|STRHEXA [153654]
gi|153608|gb|M14339.1|STRDPN2A [153608]
gi|153605|gb|M14340.1|STRDPN1A [153605]
gi|643543|gb|U20084.1|SPU20084 [643543]
gi|643541|gb|U20083.1|SPU20083 [643541]
gi|643539|gb|U20082.1|SPU20082 [643539]
gi|643537|gb|U20081.1|SPU20081 [643537]
gi|643535|gb|U20080.1|SPU20080 [643535]
gi|643533|gb|U20079.1|SPU20079 [643533]
gi|643531|gb|U20078.1|SPU20078 [643531]
gi|643529|gb|U20077.1|SPU20077 [643529]
gi|643527|gb|U20076.1|SPU20076 [643527]
gi|643525|gb|U20075.1|SPU20075 [643525]
gi|643523|gb|U20074.1|SPU20074 [643523]
gi|643521|gb|U20073.1|SPU20073 [643521]
gi|643519|gb|U20072.1|SPU20072 [643519]
gi|643517|gb|U20071.1|SPU20071 [643517]
gi|643515|gb|U20070.1|SPU20070 [643515]
gi|643513|gb|U20069.1|SPU20069 [643513]
gi|643511|gb|U20068.1|SPU20068 [643511]
gi|643509|gb|U20067.1|SPU20067 [643509]
gi|1017802|gb|U37560.1|SPU37560 [1017802]
gi|663277|gb|M36180.1|STRCOMAA [663277]
gi|437704|gb|L20670.1|STRHYALURO [437704]
gi|153849|gb|L07751.1|STRTN5252R [153849]
gi|153855|gb|M25519.1|STRVA1 [153855]
gi|153853|gb|M80215.1|STRUVS402A [153853]
gi|153848|gb|L07750.1|STRTN5252L [153848]
gi|153840|gb|M74122.1|STRSURPROA [153840]
gi|153796|gb|M60763.1|STRRRNAA [153796]
gi|153791|gb|M31296.1|STRRECP [153791]
gi|516639|gb|L20556.1|STRPLPA [516639]
gi|153783|gb|M28679.1|STRPROMB [153783]
gi|153782|gb|M28678.1|STRPROMA [153782]
gi|153766|gb|M90527.1|STRPONA [153766]
gi|153764|gb|J04479.1|STRPOLA [153764]
gi|153752|gb|M25515.1|STRNG4369 [153752]
gi|153722|gb|L08611.1|STRMLTODX [153722]
gi|153702|gb|J01796.1|STRMALMXP [153702]
gi|153701|gb|J01795.1|STRMALMX [153701]
gi|153693|gb|M13812.1|STRLYTPN [153693]
gi|153691|gb|M17717.1|STRLYS [153691]
gi|153667|gb|M25525.1|STRKAG73 [153667]
gi|398102|gb|L20564.1|STREXP9B [398102]
gi|398100|gb|L20563.1|STREXP9A [398100]
gi|398098|gb|L20562.1|STREXP8A [398098]
gi|398096|gb|L20561.1|STREXP7A [398096]
gi|398094|gb|L20560.1|STREXP6A [398094]
gi|398092|gb|L20559.1|STREXP5A [398092]
gi|398090|gb|L20558.1|STREXP4A [398090]
gi|153626|gb|J04234.1|STREXOA [153626]
gi|153612|gb|M11226.1|STRDPNM [153612]
gi|153603|gb|M25521.1|STRDN87669 [153603]
gi|153601|gb|M25526.1|STRDN87577 [153601]
gi|153599|gb|M25522.1|STRDN179 [153599]
gi|153594|gb|M37688.1|STRDACA [153594]
gi|153582|gb|L07752.1|STRATTB [153582]
gi|466514|gb|L31413.1|STR1RRA [466514]
gi|153551|gb|M25520.1|STR8249 [153551]
gi|153549|gb|M25524.1|STR5313972 [153549]
gi|153547|gb|M25517.1|STR29044 [153547]
gi|153545|gb|M25523.1|STR181071 [153545]
gi|153541|gb|M25518.1|STR121 [153541]
gi|153539|gb|M25516.1|STR110K70 [153539]
gi|506632|gb|U04047.1|SPU04047 [506632]
gi|393267|gb|L19055.1|STRPAPA [393267]
gi|442066|gb|S62272.1|S62272 [442066]
gi|295191|gb|L15190.1|STRPURISYN [295191] 
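
The identifiers listed above follow the legacy NCBI layout "gi|GI number|database|accession.version|locus [GI number]". Purely as an illustration, the following minimal Python sketch parses one such entry and checks that the bracketed GI number repeats the leading one. The function name parse_entry and the returned field names are hypothetical and are not part of the original disclosure; the accepted database tags (gb, emb, dbj) are an assumption.

```python
import re

# Assumed layout: gi|<GI>|<db>|<accession>.<version>|<locus> [<GI>]
ENTRY = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>gb|emb|dbj)\|"
    r"(?P<acc>[A-Z]{1,2}\d{5,6})\.(?P<ver>\d+)\|"
    r"(?P<locus>\S+)\s+\[(?P<gi2>\d+)\]$"
)


def parse_entry(line):
    """Return the fields of one identifier line, or None if it is malformed."""
    m = ENTRY.match(line.strip())
    # Reject lines that do not fit the layout or whose two GI numbers differ.
    if m is None or m.group("gi") != m.group("gi2"):
        return None
    return {
        "gi": int(m.group("gi")),
        "db": m.group("db"),
        "accession": m.group("acc"),
        "version": int(m.group("ver")),
        "locus": m.group("locus"),
    }
```

A check of this kind flags entries in which the two GI numbers disagree or a field separator is missing.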
CLAIMS

What is claimed is:

1. A method for identifying a bacteriophage coding region encoding a
product active on an essential bacterial target, comprising identifying a nucleic acid
sequence encoding a gene product which provides a bacteria-inhibiting function when
said bacteriophage infects a host bacterium,

wherein said bacteriophage is uncharacterized and said host bacterium
is a pathogenic bacterium.

2. The method of claim 1, further comprising expressing a recombinant
bacteriophage ORF in cells of a bacterial strain, wherein inhibition of said cells
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

3. The method of claim 2, wherein inhibition of said bacterium following
expression of said ORF is determined by comparison with the growth or viability of
said bacterium following expression of an inactivated mutant form of said ORF or in
the absence of expression of said ORF, and wherein inhibition of said bacterium
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

4. The method of claim 2, wherein expression of said ORF is inducible. 

5. The method of claim 1, further comprising sequencing at least a 
portion of a bacteriophage genome. 

6. The method of claim 1, wherein at least a portion of the nucleotide
sequence of a bacteriophage genome is known, said method further comprising
identifying at least one ORF in said portion by computer analysis of said sequence.

7. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify
homologous genes or gene products of known biochemical function, thereby
indicating the biochemical function of said polypeptide.



8. The method of claim 7, wherein said homologous gene or gene product 
is a bacterial gene important for cell viability. 

9. The method of claim 7, wherein said homologous gene or gene product
is a gene or gene product known to have a bacteria-inhibiting function.

10. The method of claim 6, further comprising analyzing the sequence of 
said at least one ORF or of a polypeptide encoded by said ORF to identify structural 
motifs in said polypeptide, thereby indicating the cellular function of said polypeptide. 

11. The method of claim 1, wherein a host bacterium for said
bacteriophage is selected from the species group consisting of bacteria listed in Table
1.

12. The method of claim 1, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage listed in Table 1.

13. The method of claim 2, wherein a plurality of bacteriophage ORFs are 
expressed in at least one bacterium. 

14. The method of claim 13, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.

15. The method of claim 14, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

16. The method of claim 1, wherein said pathogenic bacterium is an animal 
pathogen. 

17. The method of claim 16, wherein said pathogenic bacterium is a human
pathogen.

18. The method of claim 1, wherein said pathogenic bacterium is a plant 
pathogen. 

19. The method of claim 1, further comprising confirming the inhibitor 
function of said ORF. 

20. The method of claim 19, wherein said confirming comprises
expressing a loss-of-function mutant form of said ORF in said host bacterium.

21. The method of claim 1, wherein said identifying a nucleic acid
sequence encoding a gene product active on an essential bacterial target comprises
identifying a nucleic acid sequence encoding a homolog of a bacteriophage
polypeptide known to be active on an essential bacterial target.

22. The method of claim 1, wherein said identifying a bacteriophage
coding region comprises identifying a first coding region from a bacteriophage having
a non-pathogenic host bacterial strain related to said pathogenic bacterium, said first
coding region encoding a product active on an essential bacterial target; and
identifying a homolog of said first coding region, wherein said
homolog is a probable said bacteriophage coding region encoding a product active on
an essential bacterial target.

23. The method of claim 2, wherein a plurality of bacteriophage ORFs 
from a plurality of different bacteriophage are expressed in at least one bacterium. 

24. The method of claim 23, wherein each of said plurality of 
bacteriophage ORFs are expressed in different bacteria. 

25. A method for identifying a target for antibacterial agents, comprising
determining the bacterial target of an uncharacterized bacteriophage inhibitor protein.

26. The method of claim 25, wherein said determining comprises
identifying at least one bacterial protein which binds to said bacteriophage inhibitor
protein or a fragment thereof.

27. The method of claim 26, wherein said binding is determined using 
affinity chromatography on a solid matrix. 



29. The method of claim 28, wherein said genetic screen is a yeast two-hybrid
screen.

30. The method of claim 25, wherein said determining comprises a
co-immunoprecipitation assay or a protein-protein crosslinking assay.

31. The method of claim 25, wherein said determining comprises
identifying a mutated bacterial coding sequence which protects a bacterium from said
bacteriophage inhibitor.

32. The method of claim 25, wherein said determining comprises 
identifying a bacterial coding sequence which protects a bacterium against said 
bacteriophage inhibitor when expressed at high levels in said bacterium. 

33. The method of claim 25, wherein said determining further comprises
identifying a bacterial nucleic acid sequence encoding a polypeptide target of said
bacteriophage inhibitor protein.

34. The method of claim 33, wherein said nucleic acid sequence is
identified by determining at least a portion of the amino acid sequence of a bacterial
protein target, and identifying a bacterial nucleic acid sequence which encodes said
protein target.

35. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial species selected from the group consisting of species of the
genera listed in Table 1.

36. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial strain selected from the group consisting of species listed in
Table 1.

37. The method of claim 25, wherein said inhibitor protein is naturally
produced by a bacteriophage selected from the group consisting of uncharacterized
bacteriophage listed in Table 1.



38. The method of claim 25, further comprising identifying a 
bacteriophage ORF which encodes a product having a bacteria-inhibiting function. 

39. The method of claim 38, wherein said identifying a phage ORF
comprises expressing at least one bacteriophage ORF in a bacterium, wherein
inhibition of said bacterium following said expression is indicative that said ORF
encodes a bacteria-inhibiting function.

40. The method of claim 39, wherein a plurality of bacteriophage ORFs are 
expressed in at least one bacterium. 

41. The method of claim 40, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.

42. The method of claim 41, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

43. The method of claim 25, wherein said determining the bacterial target 
of a bacteriophage inhibitor protein is performed for a plurality of different 
bacteriophage of the same host bacterium. 

44. The method of claim 25, wherein said bacterial target originates from
an animal pathogen.

45. The method of claim 44, wherein said bacterial target is a gene 
homologous to a gene from an animal pathogen. 

46. The method of claim 44, wherein said pathogen is a human pathogen. 

47. The method of claim 25, wherein said bacterial target originates from a 
plant pathogen. 

48. The method of claim 25, wherein said bacterial target is a gene 
homologous to a gene from a plant pathogen. 

49. The method of claim 25, further comprising determining the cellular or
biochemical function or both of said inhibitor protein.

50. The method of claim 25, wherein said identifying the bacterial target
comprises identifying a phage-specific site of action.

51. An isolated, purified, or enriched nucleic acid sequence at least 15
nucleotides in length, wherein said sequence corresponds to at least a portion of a
bacteriophage sequence, and wherein said bacteriophage is selected from the group
consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

52. The nucleic acid sequence of claim 51, wherein said sequence 
comprises at least 50 nucleotides. 

53. The nucleic acid sequence of claim 51, wherein said nucleic acid
sequence corresponds to at least a portion of a nucleic acid sequence which encodes a
product which provides a bacteria-inhibiting function.

54. The nucleic acid sequence of claim 53, wherein said nucleic acid 
sequence encodes a polypeptide which provides a bacteria-inhibiting function. 

55. The nucleic acid sequence of claim 54, wherein said nucleic acid
sequence is transcriptionally linked with regulatory sequences enabling induction of
expression of said sequence.

56. An isolated, purified, or enriched polypeptide comprising at least a
portion of a protein providing a bacteria-inhibiting function, wherein said polypeptide
is normally encoded by a bacteriophage selected from the group consisting of
Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus
bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

57. The polypeptide of claim 56, wherein said polypeptide provides said 
bacteria-inhibiting function. 

58. The polypeptide of claim 56, wherein said polypeptide comprises a
portion at least 10 amino acid residues in length of a said polypeptide normally
encoded by said bacteriophage.

59. A recombinant vector comprising a bacteriophage ORF corresponding
to an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of uncharacterized bacteria of
Table 1.

60. The vector of claim 59, wherein said vector is an expression vector. 

61. The vector of claim 59, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage of Table 1.

62. The vector of claim 61, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

63. The vector of claim 60, wherein expression of said ORF is inducible. 

64. A recombinant cell comprising a vector, wherein said vector comprises
an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of bacterial species of Table 1.

65. The recombinant cell of claim 64, wherein said bacteriophage is
selected from the group consisting of uncharacterized phage of Table 1.

66. The cell of claim 65, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

67. The cell of claim 64, wherein said vector is an expression vector and
expression of said ORF is inducible.

68. A method for identifying an antibacterial agent, comprising identifying
an active portion of a product of a bacteria-inhibiting ORF of a bacteriophage.

69. The method of claim 68, further comprising constructing a synthetic
peptidomimetic molecule, wherein the structure of said molecule corresponds to the 
structure of said active portion. 

70. A method for identifying a compound active on a target of a
bacteriophage inhibitor protein, comprising the steps of:

contacting a bacterial target protein with a test compound; and
determining whether said compound binds to or reduces the level of
activity of said target protein,

wherein binding of said compound with said target protein or a
reduction of the level of activity of said protein is indicative that said compound is
active on said target and wherein said target is uncharacterized.

71. The method of claim 70, wherein said contacting is carried out in vitro.

72. The method of claim 70, wherein said contacting is carried out in vivo
in a cell.

73. The method of claim 70, wherein said compound is a small molecule.

74. The method of claim 70, wherein said compound is a peptidomimetic
compound.

75. The method of claim 70, wherein said compound is a fragment of a
bacteriophage inhibitor protein.

76. The method of claim 70, further comprising determining the site of 
action of said compound on said target protein. 

77. The method of claim 70, wherein said contacting is performed for a
plurality of said target proteins.



wherein said target is naturally produced by a pathogenic bacterium.

79. The method of claim 78, wherein said plurality of compounds are 
small molecules. 



80. The method of claim 78, wherein said determining is performed for a 
plurality of said targets. 



81. A method for inhibiting a bacterium, comprising the step of:

contacting said bacterium with a compound active on a target of a
bacteriophage inhibitor protein, wherein said target or the target site is
uncharacterized.

82. The method of claim 81, wherein said compound is said protein or an
active fragment thereof.

83. The method of claim 81, wherein said compound is a structural
mimetic of said protein.

84. The method of claim 81, wherein said compound is a small molecule. 

85. The method of claim 81, wherein said contacting is performed in vitro. 

86. The method of claim 81, wherein said contacting is performed in vivo
in an animal.

87. The method of claim 86, wherein said animal is a human.

88. The method of claim 81, wherein said contacting is carried out in vivo
in a plant.

89. The method of claim 81, wherein said bacterium is selected from the 
group of bacteria listed in Table 1. 

90. A method for treating a bacterial infection in an animal suffering from
an infection, comprising administering to said animal a therapeutically effective
amount of compound active on a target of a bacteriophage inhibitor protein in a
bacterium involved in said infection,

wherein said target is an uncharacterized target or the compound is active at an
uncharacterized target site.

91. The method of claim 90, wherein said compound is a small molecule. 

92. The method of claim 90, wherein said compound is a peptidomimetic
compound.

93. The method of claim 90, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

94. The method of claim 90, wherein said animal is a mammal. 

95. The method of claim 94, wherein said mammal is a human. 

96. The method of claim 90, wherein said bacterium is selected from the
group listed in Table 1.

97. The method of claim 90, wherein said bacteriophage inhibitor protein 
is from a bacteriophage selected from the group of bacteriophage listed in Table 1. 

98. A method for prophylactically treating an animal at risk of an infection,
comprising administering to said animal a prophylactically effective amount of a
compound active on a target of a bacteriophage inhibitor protein,

wherein said target is an uncharacterized target or the site of action of
said compound is an uncharacterized target site.

99. The method of claim 98, wherein said compound is a small molecule. 

100. The method of claim 98, wherein said compound is a peptidomimetic
compound.

101. The method of claim 98, wherein said compound is a fragment of a
bacteriophage inhibitor protein.

102. The method of claim 98, wherein said animal is a mammal. 

103. The method of claim 102, wherein said mammal is a human. 

104. An antibacterial agent active on a target of a bacteriophage inhibitor
protein, wherein said target is an uncharacterized target or said agent is active at a
phage-specific site on said target.

105. The agent of claim 104, wherein said agent is a peptidomimetic of a
bacteriophage inhibitor polypeptide.

106. The agent of claim 104, wherein said agent is a small molecule. 

107. The agent of claim 104, wherein said agent is a fragment of a 
bacteriophage inhibitor polypeptide. 

108. The agent of claim 104, wherein said agent is active at a phage-specific 
site on said target. 

109. A method of making an antibacterial agent, comprising the steps of:

a) identifying a target of a bacteriophage inhibitor polypeptide;

b) screening a plurality of test compounds to identify a compound
active on said target; and

c) synthesizing said compound in an amount sufficient to provide a
therapeutic effect when administered to an organism infected by a bacterium naturally
producing said target.

110. The method of claim 109, wherein said compound is a small molecule. 

111. The method of claim 109, wherein said compound is a peptidomimetic
compound.

112. The method of claim 109, wherein said compound is a fragment or
derivative of a bacteriophage inhibitor protein.

113. A computer readable device having recorded therein a nucleotide
sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus
bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at
least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a
degenerate equivalent, a homologous sequence, or at least one amino acid sequence
encoded by said nucleotide sequence; and

a nucleotide sequence or amino acid sequence analysis program,
wherein said program can perform at least one sequence analysis on said
nucleotide or amino acid sequence.

114. The device of claim 113, wherein said at least a portion of at least one
bacteriophage genome comprises at least one ORF.

115. The device of claim 113, wherein said device comprises a medium
selected from the group consisting of floppy disk, computer hard drive, optical disk,
computer random access memory, and magnetic tape, wherein said nucleotide or
amino acid sequence or said program or both are recorded on said medium.

116. The device of claim 113, wherein said portion of at least one
bacteriophage genomic nucleotide sequence comprises at least 50% of at least one
bacteriophage genomic sequence.

117. The device of claim 113, wherein said at least one bacteriophage
nucleotide genomic sequence comprises portions of a plurality of bacteriophage
nucleotide genomic sequences.

118. A computer-based system for identifying biologically important
portions of a bacteriophage genome, comprising:

a) a data storage medium having recorded thereon a nucleotide sequence
corresponding to a portion of at least one bacteriophage genome, wherein said
bacteriophage genome is uncharacterized;

b) a set of instructions allowing searching of said sequence to analyze said
sequence; and

c) an output device.

119. The system of claim 118, wherein said output device comprises
a device selected from the group consisting of a printer, a video display,
and a recording medium.

120. The system of claim 118, wherein said bacteriophage genome is of a 
bacteriophage selected from the group consisting of uncharacterized bacteriophage 
listed in Table 1. 

121. The system of claim 118, wherein said uncharacterized bacteriophage
is selected from the group consisting of bacteriophage 77, 3A, and 96.

122. A method for identifying or characterizing a bacteriophage ORF, 
comprising the steps of: 

a) providing a computer-based system for analyzing nucleic acid or 
amino acid sequence data, wherein said system comprises a data storage medium 
having recorded thereon at least one nucleotide or amino acid sequence corresponding 
to a portion of at least one uncharacterized bacteriophage genome, a set of instructions 
allowing searching of said sequence to analyze said sequence; and an output device; 

b) analyzing at least a portion of at least one said sequence; and 

c) outputting results of said analyzing to said output device. 

123. The method of claim 122, wherein said analysis identifies sequence
similarity or homology with sequences selected from the group consisting of bacterial
ORFs encoding products with related biological function; ORFs encoding known
inhibitors of bacteria; and essential bacterial ORFs.

124. The method of claim 122, wherein said analysis comprises identifying
a probable biological function based on identification of structural elements or
sequence homology or similarity.

125. The method of claim 122, wherein said bacteriophage is selected from 
the group consisting of uncharacterized bacteriophage listed in Table 1. 



126. The method of claim 125, wherein said uncharacterized bacteriophage 
is selected from bacteriophage 77, 3A, and 96. 
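
Claims 6 and 122 recite identifying ORFs by computer analysis of a bacteriophage sequence. One simple way such an analysis could be sketched, purely as an illustration and not as the method of the disclosure, is a six-frame scan for open reading frames running from an ATG to the next in-frame stop codon. The function names, the min_codons threshold, and the restriction to ATG start codons are assumptions made for this example; bacterial genes may also use alternative start codons, which this sketch ignores.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Scan all six reading frames for ATG...stop ORFs.

    Returns (strand, start, end) tuples. Coordinates are 0-based and
    end-exclusive (the stop codon is included), and are given relative
    to the strand that was scanned. min_codons counts the codons from
    the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG of the current ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

The ORFs returned by such a scan could then be translated and compared against sequence databases, as contemplated in claims 7 and 123.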
FIG. 1B.




PCR of pT0021 with XhoF & BamHNR

XhoF: 5'-AATTCTCGAGTAAAATAACAT-3' (XhoI site)
BamHNR: AAATCAGGTGACTGTTGAGAAAAGGAGGCGGATCCCG (stop of arsR, RBS, BamHI site)

Modified between stop of arsR and BamHI
Digestion with XhoI & BamHI
Ligation

PCR of pT0021 with LucFFB & LucFFH

LucFFB: 5'-CGGGATCCATGAGGGGTTCCGAAGACG (BamHI site, start of LucFF; original BamHI was modified)
LucFFH: GAAAGTCCAAATTGTAAGCTTGGG (stop of LucFF, HindIII site)

Digestion with BamHI & HindIII
Ligation

Resulting construct:

CTCGAG (XhoI) - arsR [TGA] GAAAAGGAGGCGGATCC [ATG] LucFF [TAA] GCTT (HindIII)
(RBS and BamHI site lie between the arsR stop codon and the LucFF start codon)

Modified in the vicinity of BamHI
Cloning site for ORFs: BamHI & HindIII
No additional codons in the induced protein
FIG. 2.




Verification of pTHA/ORF clones 
by PCR and sequencing 
FIG. 3.

(A) Functional assay on semi-solid support media

Frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants
  -> 1:10 and 1:100 dilution in saline solution
  -> either streak 5 µl of the 1:10 dilution onto agar plates containing
     0, 2.5, 5, and 7.5 µM NaAsO2,
     or spot 3 µl of the 1:10 and 1:100 dilutions onto agar plates containing
     0, 2.5, 5, and 7.5 µM NaAsO2
  -> O/N, 37°C
  -> Compare bacterial growth on plates with and without NaAsO2

(B) Functional assay in liquid medium

O/N culture inoculated from frozen stock of
phage 77 pTHA/ORF S. aureus RN4220 transformants
  -> 1:100 dilution of O/N culture
  -> 2 h, 37°C, 250 rpm
  -> Fresh culture
  -> 150 µl into 2.5 ml containing 0 and 5 µM NaAsO2
  -> 3.5 h, 37°C, 250 rpm
  -> Measure OD565
  -> 1:10 serial dilution from 10^-1 to 10^-6
  -> 20 µl of 10^-4 to 10^-6
  -> Spot onto agar plate
  -> O/N, 37°C
  -> Count colonies
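
Colony counts such as those obtained at the end of panel (B) are commonly converted to viable cells per millilitre by correcting for the dilution and the spotted volume. A minimal sketch follows, assuming a 20 µl spot volume as read from the figure; the function names and the survival ratio between induced and uninduced cultures are illustrative assumptions and are not part of the figure.

```python
def cfu_per_ml(colonies, dilution, spot_volume_ul=20.0):
    """Estimate colony-forming units per ml of the undiluted culture.

    colonies:       colonies counted in one spot
    dilution:       dilution factor of the spotted sample, e.g. 1e-4
    spot_volume_ul: spotted volume in microlitres (assumed 20 µl)
    """
    if colonies < 0 or dilution <= 0 or spot_volume_ul <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    spotted_ml = spot_volume_ul / 1000.0
    return colonies / (dilution * spotted_ml)


def percent_survival(cfu_induced, cfu_uninduced):
    """Viable count with inducer as a percentage of the count without it."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced count must be > 0")
    return 100.0 * cfu_induced / cfu_uninduced
```

For example, 40 colonies in a 20 µl spot of the 10^-4 dilution correspond to about 2 x 10^7 CFU per ml.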
FIG. 7.

Abbreviations: 

kan: gene encoding kanamycin resistance 

cat: gene encoding chloramphenicol resistance 

ori + and -: origin of replication in gram-positive and 

gram-negative bacteria, respectively 

arsR: gene encoding regulatory protein of the ars promoter 

P: ars promoter 

lucFF: gene encoding luciferase protein. This portion will
be removed and replaced by individual S. aureus phage
genes.
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